A QUARTERLY MODEL FOR THE
UNITED STATES ECONOMY* 


HAROLD BARGER 
Columbia University 
and 
LAWRENCE R. KLEIN 
University of Michigan 


MANY theories about economic behavior imply a belief that it can
be represented by some system of equations whose solution is
determinate. The econometric problem is to specify the form of the
functions and to estimate the parameters. It may be that there exists
more than one system adequate for the purpose; it may also be that,
for an entire economy, even the simplest such system is too complex
to be estimated from any known body of data. This paper describes an
attempt to represent quarterly movements of gross national product
in the U. S. economy by a model with as few as three equations. Failure
in the attempt may suggest that a more complex model is required.

Most empirical econometric studies have been based on annual time
series data,¹ the sample size seldom exceeding twenty to thirty
observations. It has frequently been suggested that econometric research
would benefit from the use of quarterly data. The sample would be
enlarged by a factor of four and more detailed information obtained
about movements of the economy. Although we do not expect to get
four times as much information by shifting from annual to quarterly
data, we do expect to add something to our knowledge of underlying
economic structure. For instance, lags can be measured more accurately.





* This paper describes some results of an investigation made possible by grants from the Columbia 
University Council for Research in the Social Sciences and the Social Science Research Council, to 
both of whom we are grateful. We are also indebted to Sylvia Schlachter, formerly of Columbia
University, and now of the University of Michigan, who carried out the computations.

1 See, however, Colin Clark, “A System of Equations Explaining the United States Trade Cycle, 
1921 to 1941,” Econometrica, 17 (1949), pp. 93-124, where use is made of quarterly data. 
On the other hand we face new problems such as seasonal variation and
increased serial correlation of disturbances. 

In this paper we shall discuss two simple quarterly models and esti- 
mate their parameters from U. S. data for the interwar period. We 
shall then test the models by extrapolating the results into the period 
since World War II. 


CHOICE OF A MODEL 


Our starting point is the three-equation model already fitted by 
Klein to annual data.² This choice was made so that we would have a
good basis for comparison of our results with those of an annual
model. The three-equation model is compact and permits considerable 
experimentation. Of the following variables, all of which are measured 
in constant prices, the first six are regarded as endogenous and the 
remaining as exogenous: 


$C$      consumer expenditures
$W_1$    wages and salaries paid by private industry
$\Pi$    non-wage income or “profits”
$I$      net private domestic investment
$K$      year-end stock of capital
$Y$      net national income
$W_2$    wages and salaries paid by government
$T$      indirect taxes less subsidies
$G$      government purchases plus net foreign balance
$t$      time
$u$      random disturbance


In the annual model, equations for which are as follows, negative sub- 
scripts outside parentheses denote variables lagged by the number of 
years indicated. 


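(1.1)  $C = \alpha_0 + \alpha_1 (W_1 + W_2) + \alpha_2 \Pi + \alpha_3 (\Pi)_{-1} + u_1$

(1.2)  $I = \beta_0 + \beta_1 \Pi + \beta_2 (\Pi)_{-1} + \beta_3 (K)_{-1} + u_2$

(1.3)  $W_1 = \gamma_0 + \gamma_1 (Y + T - W_2) + \gamma_2 (Y + T - W_2)_{-1} + \gamma_3 t + u_3$

(1.4)  $Y + T = C + I + G$

(1.5)  $Y = \Pi + W_1 + W_2$

(1.6)  $I = K - (K)_{-1}$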





2 Lawrence R. Klein, Economic Fluctuations in the United States, 1921-1941 (New York: Wiley, 
1950), pp. 55-80. 
Of the three stochastic equations, (1.3) can be regarded either as
distributing income between wages and profits (including interest and
rent), or as the demand for labor. If we prefer the latter viewpoint, a
direct measure of private output is required; therefore we replace
$(Y + T - W_2)$ by $(C + I + G - W_2)$.³ The last three equations, being
merely accounting identities, have known coefficients and are not sub- 
ject to random disturbances in behavior. Equation (1.4), however, is 
subject to errors of observation insofar as a statistical discrepancy 
exists between direct estimates of national product (expenditure 
version) and estimates of national income (factor payments version) 
converted to market prices. 

To adapt the above model so that the variables relate to quarters 
instead of years, a one-year lag needs to be replaced by a lag distributed 
over several quarters. Yet if lagged variables are introduced too freely 
into a linear scheme, intercorrelation between them may render results 
indeterminate. Therefore we grouped lagged variables in pairs (from 
this point forward, negative subscripts outside parentheses will denote 
values lagged by the number of quarters indicated):

$$(x)_{-(n+1/2)} = \frac{(x)_{-n} + (x)_{-(n+1)}}{2}.$$


With the help of this convention we might (for instance) write the fol- 
lowing quarterly version of (1.1) to (1.6), the time unit referred to 
being three months instead of a year. 


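(2.1)  $C = \alpha_0 + \alpha_1 (W_1 + W_2) + \alpha_2 \Pi + \alpha_3 (\Pi)_{-3/2} + u_1$,

(2.2)  $I = \beta_0 + \beta_1 \Pi + \beta_2 (\Pi)_{-3/2} + \beta_3 (\Pi)_{-7/2} + \beta_4 (\Pi)_{-11/2} + \beta_5 (K)_{-1} + u_2$,

(2.3)  $W_1 = \gamma_0 + \gamma_1 (C + I + G - W_2) + \gamma_2 (C + I + G - W_2)_{-3/2} + \gamma_3 t + u_3$,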
together with (1.4), (1.5), and (1.6) above. 
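
The paired-lag convention can be illustrated with a short sketch. The function name `half_lag` and the list-based series are hypothetical choices made for this illustration; the only thing taken from the text is the rule that a lag of n + 1/2 quarters is the mean of the values lagged n and n + 1 quarters.

```python
from fractions import Fraction


def half_lag(series, t, lag):
    """Value of `series` at position t under the paired-lag convention.

    `lag` must be a half-integer such as Fraction(3, 2); the result is the
    mean of the observations lagged n and n + 1 quarters, where n = floor(lag).
    """
    lag = Fraction(lag)
    n = int(lag)  # 3/2 -> 1, 7/2 -> 3, 11/2 -> 5
    if lag - n != Fraction(1, 2):
        raise ValueError("lag must be a half-integer such as 3/2")
    if t - (n + 1) < 0:
        raise IndexError("not enough history for this lag")
    return (series[t - n] + series[t - n - 1]) / 2


# Hypothetical quarterly series, oldest observation first.
profits = [10, 20, 30, 40, 50, 60, 70]
print(half_lag(profits, 6, Fraction(3, 2)))   # mean of lags 1 and 2
print(half_lag(profits, 6, Fraction(11, 2)))  # mean of lags 5 and 6
```

Under this reading the three lagged profit terms in (2.2) together span the six preceding quarters without overlap.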


THE ESTIMATION OF PARAMETERS 


The equations (2.1), (2.2), and (2.3) each contain at least two 
endogenous variables. We could, of course, estimate the coefficients 
in each equation by conventional use of least squares. To do so we 
should have arbitrarily to treat some particular variable as dependent, 
and all others (endogenous or otherwise) as independent. This procedure 
is arbitrary and, moreover, is known to lead to biased estimates of the 





3 Were it not for the statistical discrepancy in our national accounts, we would have the same 
measure of output reckoned as a sum of expenditures or as a sum of factor income payments, and the 
replacement mentioned would not affect computations. 
parameters.⁴ To be sure, in at least one case that has been investigated,
i.e. the annual model given above, the bias proved to be smaller for
most parameters than the estimated sampling errors.⁵ We do not know
that this will always be the case. In any event the compulsion to choose 
as dependent one only among several variables, all of which clearly 
are endogenous, is unwelcome. Therefore we decided to use unbiased 
or consistent methods of estimation. 

To obtain consistent estimates of the parameters in (2.1) to (2.3), 
the equations must be regarded as simultaneous and the system treated 
as a whole. With six endogenous and nine exogenous or predetermined 
variables, estimation by the limited-information maximum-likelihood 
method is fairly laborious; yet this system is the closest analogue to 
the annual model, and computation of its parameters was undertaken.⁶
The results contradicted the assumption that the disturbances ($u_1$,
$u_2$ and $u_3$) were random. Quarterly variables are more highly autocorrelated than
annual variables, and it is not surprising that the same should be true
of the residuals from a regression between such variables.

An obvious device is to assume that the disturbances satisfy a lag 
correlation scheme such as 


$$u_i = \sum_{j=1}^{3} \rho_{ij} (u_j)_{-1} + v_i \qquad (v_i \text{ mutually independent}).$$

However, to estimate $\rho_{ij}$ simultaneously with the other parameters by
the method of maximum likelihood would be burdensome. 

In the limited information method, each individual equation of the 
system uses only restrictions on the parameters of that equation and 
not those on other equations. Thus it would be possible to obtain 
limited information estimates if each disturbance satisfied a pure auto- 
regressive equation 


$$u_i = \rho_i (u_i)_{-1} + v_i \qquad (v_i \text{ mutually independent}).$$

Yet the computational burden imposed even with this simplification
was more than we wanted to undertake at this stage of research.


A RECURSIVE MODEL


Our procedure was somewhat different. We converted the system 
of equations to recursive form, thus enabling us to obtain consistent 





4 See, e.g., T. Koopmans, “Statistical Estimation of Simultaneous Economic Relations,” Journal 
of the American Statistical Association, 40 (1945), pp. 448-66.

5 Klein, op. cit. Compare pp. 68, 75. 

6 For computational procedure see, e.g., T. W. Anderson and H. Rubin, “Estimation of the Parame- 
ters of a Single Equation in a Complete System of Stochastic Equations,” Annals of Mathematical 
Statistics, XX (1949), pp. 46-63. 
estimates by repeated application of single-equation least-squares
methods.⁷ It proved computationally feasible to do this while assuming
that the disturbances satisfied first-order autoregressive equations. For 
this arrangement the first equation to be estimated must have only one 
endogenous variable, the remaining variables being predetermined or 
exogenous. Each subsequent equation must contribute only one addi- 
tional endogenous variable. In estimating an equation that contains 
more than one endogenous variable, in the case of all but one such 
variable the calculated rather than the observed values are used in 
the computations. The procedure leads to consistent estimates. In some 
recursive systems, the assumption that unlagged disturbances in 
separate equations are independent makes maximum likelihood esti- 
mates obtainable by repeated application of the method of least 
squares. If we drop these independence assumptions and substitute 
calculated instead of observed values of endogenous variables in suc- 
cessive equations, we obtain consistent but not necessarily full maxi- 
mum-likelihood estimates. 
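
The stepwise procedure can be sketched on synthetic data. Everything in the block below is an assumption made for illustration (the two-equation structure, the coefficient values, the variable names, and the deliberately correlated disturbances); it is not the authors' computation and it ignores the autoregressive treatment of the disturbances. It only shows the mechanics: fit the first equation on a predetermined variable, then use the calculated rather than the observed values in the next equation.

```python
import random


def simple_ols(x, y):
    """Intercept and slope of the least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


random.seed(0)
n = 2000
z = [random.gauss(0, 1) for _ in range(n)]              # predetermined variable (hypothetical)
e_inv = [random.gauss(0, 1) for _ in range(n)]          # disturbance, first equation
e_con = [0.8 * a + random.gauss(0, 1) for a in e_inv]   # correlated with e_inv on purpose

# First equation: one endogenous variable explained by a predetermined one.
inv = [1.0 + 2.0 * a + b for a, b in zip(z, e_inv)]
# Second equation: adds one further endogenous variable (true slope 0.3).
con = [0.5 + 0.3 * a + b for a, b in zip(inv, e_con)]

# Step 1: estimate the first equation and keep its calculated values.
b0, b1 = simple_ols(z, inv)
inv_hat = [b0 + b1 * a for a in z]

# Step 2: regress on calculated, not observed, investment.
_, stepwise_slope = simple_ols(inv_hat, con)

# For contrast: least squares on observed investment.
_, naive_slope = simple_ols(inv, con)

print(stepwise_slope, naive_slope)
```

In this constructed example the slope from the stepwise route should lie close to the true value of 0.3, while the regression on observed values is pulled upward by the correlation between the two disturbances.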

From a formal standpoint it makes no difference which equation we 
estimate first, provided it can be written with one endogenous variable 
as a function of predetermined variables alone. We believe that in- 
vestment decisions depend upon a longer range of past experience, and 
result more slowly in actual expenditures, than other types of decision. 
Therefore we put $\beta_1$ equal to zero in equation (2.2), estimate the remaining
parameters in this equation by least squares, and use the calculated
value of investment as an exogenous variable in the consumption
equation. This treatment allows consumption to respond immediately
to changes in income and permits consistent estimation by least
squares. On the other hand, the distinction between wage and nonwage
income, as in equation (2.1), has to be abandoned. On substituting
calculated (denoted by a superior ~) for observed values of $I$,

$$Y = C + \tilde{I} + G - T;$$

and we may write:

$$
\begin{aligned}
C &= \alpha_0' + \alpha_1' Y + u_1' \\
  &= \frac{\alpha_0'}{1 - \alpha_1'} + \frac{\alpha_1'}{1 - \alpha_1'} (\tilde{I} + G - T) + \frac{u_1'}{1 - \alpha_1'} \\
  &= \alpha_0'' + \alpha_1'' (\tilde{I} + G - T) + u_1''.
\end{aligned}
$$

The disadvantage resulting from the consolidation of wage and non- 
wage income may be partially offset by using Y’, disposable income, 
7 Herman Wold and L. Juréen, Demand Analysis (New York: Wiley, 1953), p. 14. 
instead of $Y$, national income.⁸ We shall further introduce lagged
consumption on the right hand side, to take account of the influence of
past behavior.⁹ Finally we substitute computed values on the right
hand side of the wage equation (2.3) and estimate this also by the 
method of least squares. 

The above procedures yield the following fully recursive model which 
may be estimated consistently by least squares methods. 


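Model A (Fully recursive)


(3.1)  $C = \alpha_0 + \alpha_1 (C)_{-3/2} + \alpha_2 (\tilde{I} + G - T') + u_1$
       $u_1 = \rho_1 (u_1)_{-1} + v_1$

(3.2)  $I = \beta_0 + \beta_1 (\Pi)_{-3/2} + \beta_2 (\Pi)_{-7/2} + \beta_3 (\Pi)_{-11/2} + \beta_4 (K)_{-1} + u_2$
       $u_2 = \rho_2 (u_2)_{-1} + v_2$

(3.3)  $W_1 = \gamma_0 + \gamma_1 (C + I + G - W_2) + \gamma_2 (C + I + G - W_2)_{-3/2} + \gamma_3 t + u_3$
       $u_3 = \rho_3 (u_3)_{-1} + v_3$

(3.4)  $Y' + T' = Y + T = C + I + G$

(3.5)  $Y' + T' - T = \Pi + W_1 + W_2$

(3.6)  $I = K - (K)_{-1}$
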

Equation (3.2) must be estimated first, and the calculated values
$\tilde{I}$ substituted in (3.1). When (3.1) has been estimated, values of $C$ and
$\tilde{I}$ are substituted in (3.3). The complication introduced by the
autoregressive treatment of the disturbances is discussed in the Appendix.
More generally, it may be seen that the condition for recursive
treatment (i.e. consistent estimation in stepwise fashion) is that the Jacobian
of the transformation connecting the disturbances with the endogenous
variables shall be triangular and (by the rule of normalization) equal
to a constant, unity. Thus, in the present example

$$\frac{\partial (v_2, v_1, v_3)}{\partial (I, C, W_1)} = \begin{vmatrix} 1 & -\alpha_2 & -\gamma_1 \\ 0 & 1 & -\gamma_1 \\ 0 & 0 & 1 \end{vmatrix} = 1.$$



The reader will observe that the first column of the Jacobian refers to 





8 Y’ is obtained from Y by adding net interest paid by government and government transfer pay- 
ments; and deducting personal tax payments, corporate tax accruals, and all contributions for social 
insurance (concepts follow Department of Commerce practice). In equations (1.1) and (2.1), data did 
not allow the wage and nonwage income separately to be placed upon a disposable basis. 

9 This type of lag relation has been found to be satisfactory in models based on annual data. See
T. M. Brown, “Habit Persistence and Lags in Consumer Behavior,” Econometrica, 20 (1952), pp. 355-72. 
equation (3.2), estimated first; the second column to equation (3.1),
estimated next; and the third column to equation (3.3), estimated last. 


A HYBRID MODEL 


We saw above that the simultaneous estimation of all three equations 
requires the assumption that disturbances are random, and that in 
practice with quarterly data this assumption is contradicted. To avoid 
this difficulty we developed a recursive model, as just explained. A 
further alternative is to use a model which is partly recursive and 
partly simultaneous. Thus we may estimate I as above and substitute 
the calculated values in the remaining two equations, and then proceed 
to estimate the latter simultaneously by the method of limited informa- 
tion (instead of successively by the method of least squares). If we take 
our consumption equation 


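$$C = \alpha_0' + \alpha_1' (C)_{-3/2} + \alpha_2' (W_1 + W_2) + \alpha_3' \Pi + u_1',$$

and for $\Pi$ write

$$\tilde{\Pi} = C + \tilde{I} + G - W_1 - W_2 - T,$$

we obtain (4.1) below.

Model B (Hybrid)

(4.1)  $C = \alpha_0 + \alpha_1 (C)_{-3/2} + \alpha_2 (W_1 + W_2) + \alpha_3 (\tilde{I} + G - T) + u_1$

(4.2)  $I = \beta_0 + \beta_1 (\Pi)_{-3/2} + \beta_2 (\Pi)_{-7/2} + \beta_3 (\Pi)_{-11/2} + \beta_4 (K)_{-1} + u_2$
       $u_2 = \rho_2 (u_2)_{-1} + v_2$

(4.3)  $W_1 = \gamma_0 + \gamma_1 (C + \tilde{I} + G - W_2) + \gamma_2 (C + I + G - W_2)_{-3/2} + \gamma_3 t + u_3$

(4.4)  $Y + T = C + I + G$

(4.5)  $Y = \Pi + W_1 + W_2$

(4.6)  $I = K - (K)_{-1}$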


Here consumption depends separately upon wage and nonwage income, 
as in the original annual model, and this feature makes fully recursive 
treatment impossible. Equation (4.2) is estimated by the method of 
least squares as before, but even after calculated values for $I$ have
been substituted, equations (4.1) and (4.3) both contain more than one
endogenous variable. However, we estimated (4.1) and (4.3)
consistently by doing so simultaneously, using computational procedures
to which reference has already been made.¹⁰





10 Anderson and Rubin, op. cit. Equations (4.1) and (4.3) do not contain autoregressive error terms 
because of the heavy additional burden of computation that their introduction would impose: see 
discussion above. 
SEASONAL VARIATION IN QUARTERLY DATA


In the later sections of this paper the parameters in Models A and B 
are estimated, and their predictive power tested, using seasonally ad- 
justed data throughout. The practical necessity of using corrected 
data can readily be demonstrated. In the first place, some components 
of national income (e.g. farm operators’ income) already have been 
partially corrected, and other components (e.g. business profits) have 
been fully corrected for seasonal variation during original collection or 
subsequent processing of the data; such series are not available in 
“seasonally unadjusted” form. The implications of using data, some of 
which have and some of which have not been corrected for seasonal 
movement, are obscure. 

In the second place, if seasonally unadjusted data are to be used, the 
seasonal behavior has to be incorporated into the model itself. A simple 
and plausible representation of seasonal behavior requires multiplica- 
tion by a parameter that varies only with the season. Thus if 


(5.1)  $Y = \alpha_1 z_1 + \alpha_2 z_2 + \cdots$


is a structural equation in the absence of seasonal variation, to allow for
such variation in the model we need to write


(5.2)  $Y = (\beta_1 z_1' + \beta_2 z_2' + \beta_3 z_3' + \beta_4 z_4')(\alpha_1 z_1 + \alpha_2 z_2 + \cdots) + u$


where $z_i'$ assumes the value unity in the $i$th quarter and is zero in all
other quarters. If the $\beta$ as well as the $\alpha$ have to be estimated from the
data, it will be found that the estimating equations do not readily admit
of numerical solution since the behavior equation (5.2) is nonlinear in
the parameters.¹¹ An advantage of using unadjusted data in this way
would be knowledge of the number of degrees of freedom absorbed in
estimating the $\beta$. The adjustment of data for seasonal variation also
uses up degrees of freedom, but we never know just how many degrees 
are absorbed in the process. 

However, theoretical superiority does not seem to lie wholly on the 
side of unadjusted data. It may be urged that the economic subject 
makes his own (rough) seasonal corrections as he goes along. The con- 
sumer does not react to his income and to the time of year as separate 
data, but asks himself, “Is this more or less than the income I would 
expect at this time of year?” The entrepreneur looks at his profits and 





11 See also L. Hurwicz, “Variable Parameters in Stochastic Processes: Trend and Seasonality,”
Statistical Inference in Dynamic Economic Models, ed. T. C. Koopmans (New York: Wiley, 1950), 
Ch. XI. 







compares them with some level expected for the season. Each applies a 
rough seasonal correction he carries in his head. 

If this version of the facts is accepted, seasonally adjusted rather 
than unadjusted data appear to be the more accurate measure of the 
variables that interest us. However, the seasonal correction effected by 
the economic subject must be based wholly on past experience, al- 
though it doubtless is continually revised as history unfolds. Of course, 
seasonal corrections made by statisticians are based on data for the 
current year, earlier years, and later years. The theoretical justification 
for using standard methods to correct the raw data for seasonal varia- 
tion is therefore incomplete. The most we can say is that seasonal ad- 
justment by conventional methods enables us to approximate more 
closely the variables most relevant to economic behavior than no 
seasonal adjustment at all. 


SAMPLE DATA USED FOR ESTIMATION 


Parameters in both models A and B were estimated from 72 quarterly 
observations for the years 1923-1940. All observations were expressed 
in constant (1939) prices. Variables were defined in accordance with 
Department of Commerce usage¹² except that we substituted a fresh
series for capital consumption allowances. Annual Commerce figures
for 1929-1938 were interpolated with data from Barger’s Outlay and
Income in the United States¹³ and were extrapolated back to 1921 in the
same way. All data are seasonally adjusted quarterly totals and are
expressed in $ million in 1939 prices,¹⁴ excepting only t which numbers the
quarters consecutively.


MODEL A (FULLY RECURSIVE) 


Investment equation. Equation (3.2) has the following coefficients 
when estimated by least squares. 





12 See Survey of Current Business, “National Income Supplements.” $\Pi$ comprises business profits
(corporate and noncorporate, including income of farmers and the independent professions), and interest 
and rents, before tax. 

13 New York: National Bureau of Economic Research, 1942. 

14 The various components of gross national product were deflated with readily available price
indexes. Capital consumption allowances were deducted from gross national product (both in 1939 
prices), yielding net national product. A comparison of the latter, quarter by quarter, with net national 
product in current prices yielded a single implicit price index which was used to deflate all components 
of income. 

15 To print the sample data would require excessive space. Magnitudes may be indicated by quoting
the following mean values for the sample period: $C$, 14,538; $W_1$, 9,784; $\Pi$, 4,126; $I$, 306; $Y$, 15,253;
$W_2$, 1,343; $T$, 1,674; $G$, 2,396; $Y'$, 16,173. Values of $K$ and $t$ are zero for the last quarter of 1922; and
22,030 and 72 respectively for the last quarter of 1940.
(6.1)  $I = -2016 + \underset{(0.403)}{0.438}\,(\Pi)_{-3/2} + \underset{(0.360)}{0.615}\,(\Pi)_{-7/2} - \underset{(0.429)}{0.205}\,(\Pi)_{-11/2} - \underset{(0.038)}{0.054}\,(K)_{-1} + u_2$

       $u_2 = \underset{(0.211)}{0.603}\,(u_2)_{-1} + v_2$


In parentheses are shown estimates of the sampling errors, whose exact 
distributions are not known. While sampling errors for individual
coefficients of $(\Pi)_{-3/2}$, $(\Pi)_{-7/2}$, and $(\Pi)_{-11/2}$ are quite large, the
corresponding error for their sum is smaller than any of the separate errors,
being 0.294. The statistic $\delta^2/s^2$ (= 2.016) may be computed as the ratio
of the mean square successive difference of the residuals $v_2$ to their
variance. The distribution of the statistic is such that for a random
series with (say) 60 degrees of freedom the probability is 0.95 that $\delta^2/s^2$
will exceed 1.6.¹⁶ Hence the result is compatible with the assumption
that the disturbances are random. 
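
The ratio of the mean square successive difference to the variance can be computed as in the sketch below. The divisors used here (n - 1 for the successive differences, n for the variance) are our reading of the usual von Neumann definition and should be treated as an assumption; the function name is hypothetical.

```python
def von_neumann_ratio(residuals):
    """Mean square successive difference divided by the variance.

    Assumed convention: delta^2 divides by n - 1, s^2 divides by n.
    Values near 2 suggest a random series; values well below 2 suggest
    positive autocorrelation.
    """
    n = len(residuals)
    if n < 2:
        raise ValueError("need at least two residuals")
    mean = sum(residuals) / n
    delta2 = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:])) / (n - 1)
    s2 = sum((r - mean) ** 2 for r in residuals) / n
    return delta2 / s2


# A steadily trending series gives a low ratio, an alternating one a high ratio.
print(von_neumann_ratio([1, 2, 3, 4]))
print(von_neumann_ratio([1, -1, 1, -1]))
```

Only the lower tail matters for the test used in the text: a computed ratio below the tabulated 1.6 is taken as evidence of autocorrelated residuals.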

Consumption equation. As the next step we substitute $\tilde{I}$ from (6.1) on
the right hand side of (3.1). Estimation of the latter equation yielded
results compatible with the assumption of random $v_1$, but reported a
negative value for $\alpha_2$. This result implies that consumption depends
inversely on current income, a conclusion we rejected. Deciding that $\alpha_2$
must be positive, we put $\rho_1$ identically equal to zero, and contented
ourselves with the estimate


(6.2)  $C = 266 + 0.990\,(C)_{-3/2} + 0.036\,(\tilde{I} + G - T') + u_1$

which is equivalent to

(6.3)  $C = 257 + 0.955\,(C)_{-3/2} + 0.035\,Y' + u_1'$





for which $\delta^2/s^2 = 1.35$, indicating significant autocorrelation of the
residuals. An autoregression coefficient computed from the observed
residuals of (6.2) was $\rho_1 = 0.329$.
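
The equivalence of the two forms of the consumption equation can be checked by simple arithmetic. The sketch assumes that disposable income satisfies Y' = C + I + G - T' (identity (3.4), with calculated investment in place of observed), so that the bracketed term in (6.2) equals Y' - C; solving for C then divides every coefficient by one plus the coefficient of that term. The function name is hypothetical.

```python
def to_disposable_income_form(a0, a1, a2):
    """Rewrite C = a0 + a1*C_lag + a2*(Y' - C) + u as an equation in Y'.

    Moving a2*C to the left-hand side and dividing by (1 + a2) gives
    C = a0/(1+a2) + a1/(1+a2)*C_lag + a2/(1+a2)*Y' + u/(1+a2).
    """
    scale = 1.0 + a2
    return a0 / scale, a1 / scale, a2 / scale


# Rounded coefficients as printed in (6.2).
b0, b1, b2 = to_disposable_income_form(266.0, 0.990, 0.036)
print(b0, b1, b2)
```

Starting from the rounded figures printed in (6.2), the result agrees with (6.3) up to rounding in the third decimal place.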

Because of the transformation of variables, the confidence intervals 
for the coefficients in (6.3) are not symmetrical about the point
estimates. If $\alpha_1'$ is the coefficient of $(C)_{-3/2}$ and $\alpha_2'$ of $Y'$ in (6.3), the limits
at the 95% level are approximately 





16 B. I. Hart and J. von Neumann, “Tabulation of the Probabilities for the Ratio of the Mean
Square Successive Difference to the Variance,” Annals of Mathematical Statistics, XIII (1942), pp. 
207-14. We of course use only the lower tail of the distribution. 
$$0.931 \le \alpha_1' \le 0.984; \quad \text{and} \quad -0.072 \le \alpha_2' \le 0.122.$$


For Model A this consumption function was the least implausible we 
obtained in a number of trials, with and without lagged values of
consumption, with and without autoregressive transformation of
disturbances. Plainly it suffers from three serious defects. (1) The residuals
cannot be considered random. (2) The low value of $\alpha_2$ (or $\alpha_2'$) furnishes
but a very weak link with the investment equation. (3) The confidence
limits for $\alpha_2$ (or $\alpha_2'$) include negative values.

The marginal propensity to consume. Despite the fact that $(C)_{-3/2}$ is
the dominating variable in (6.3), Y’ does play a larger role as the time 
over which the equation functions is lengthened. Let 


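$$(C)_1 = \alpha_0 + \alpha_1 (C)_0 + \alpha_2 (Y')_1 + (u)_1.$$


Then

$$(C)_n = \alpha_0 \sum_{i=0}^{n-1} \alpha_1^i + \alpha_1^n (C)_0 + \alpha_2 \sum_{i=0}^{n-1} \alpha_1^i (Y')_{n-i} + \sum_{i=0}^{n-1} \alpha_1^i (u)_{n-i}.$$


In the limit, if we assume a steady income stream, we have


(6.4)  $C = \dfrac{\alpha_0}{1 - \alpha_1} + \dfrac{\alpha_2}{1 - \alpha_1}\, Y' + v$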





where $v$ is a linear combination of random variables. Evidently we may
regard $\alpha_2/(1 - \alpha_1)$ as a long-run marginal propensity to consume, its
computed value being about 0.78. We therefore report a sizeable
influence of income upon consumption in the long run, even though (6.3)
has the appearance of almost pure autoregression. 
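
The long-run figure can be reproduced from the printed coefficients of (6.3). The sketch treats the lag as a single period, as the derivation above does, and omits the disturbance; the function names and the two income levels are hypothetical.

```python
def long_run_mpc(a1, a2):
    """Long-run marginal propensity to consume, a2 / (1 - a1)."""
    if not abs(a1) < 1:
        raise ValueError("the recursion converges only if |a1| < 1")
    return a2 / (1.0 - a1)


def iterate_consumption(a0, a1, a2, c0, income, periods):
    """Apply C_n = a0 + a1*C_(n-1) + a2*Y' repeatedly with income held steady."""
    c = c0
    for _ in range(periods):
        c = a0 + a1 * c + a2 * income
    return c


mpc = long_run_mpc(0.955, 0.035)
print(round(mpc, 2))

# Two steady income streams 1,000 apart: in the limit, consumption
# should differ by 1,000 times the long-run propensity.
low = iterate_consumption(257.0, 0.955, 0.035, 14000.0, 16000.0, 2000)
high = iterate_consumption(257.0, 0.955, 0.035, 14000.0, 17000.0, 2000)
print(high - low)
```

The contrast with the impact coefficient is the point: a one-unit rise in income raises consumption by only 0.035 in the same quarter, but by roughly 0.78 once the lagged-consumption term has worked itself out.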

We can also construct a joint confidence region for $\alpha_1$ and $\alpha_2$ in the
form of an ellipse, to see whether $\alpha_2/(1 - \alpha_1)$ is estimated in the range
(0, 1) even though $\alpha_2$ is not itself significantly positive.¹⁷ Our estimate
of the propensity to consume proves very rough, for about one-tenth 
of the area of an ellipse drawn at the 95% level admits negative values. 

The wage equation. The calculation of (3.3) yielded an estimate of $\rho_3$
practically equal to unity and substantial autocorrelation of the
residuals $v_3$. Moreover $(\gamma_1 + \gamma_2)$ was estimated well below 0.4, although
annual models have shown the marginal influence of $(C + I + G - W_2)$
to be between 0.5 and 0.6. Neither estimation of the equation in terms
of first differences (i.e. putting $\rho_3$ identically equal to unity), the addi-





17 An example of the preparation of this type of confidence region is given in T. Haavelmo, “Meth- 
ods of Measuring the Marginal Propensity to Consume,” Journal of the American Statistical Associa- 
tion, 42 (1947), pp. 105-22. 
tion of longer lags for the independent variable, the insertion of higher
powers of $t$, nor the use of a second-order autoregressive scheme for the
disturbances, led to improved results. Nor were lagged values of the
dependent variable helpful on the right hand side of the equation.
Probably the distribution of income cannot be satisfactorily approximated
by any simple linear relation. In the present study we contented
ourselves with the following equation in which $\rho_3$ is identically zero:


(6.5)  $W_1 = 665 + \underset{(0.210)}{0.759}\,(C + \tilde{I} + G - W_2) - \underset{(0.212)}{0.178}\,(C + I + G - W_2)_{-3/2} - \underset{(4.3)}{0.953}\,t + u_3$


The estimated coefficients are reasonable enough, but $\delta^2/s^2 = 1.21$ (i.e.
clearly < 1.6). The residuals therefore are autocorrelated, and in fact
yield an autoregression coefficient of 0.401. Despite a common belief
that the distribution of income was shifting during the period that we
chose for our sample, the coefficient of $t$ proves not to be significantly
different from zero. 


MODEL B (HYBRID) 


Investment equation. It will be recalled that equations (4.2) and (3.2) 
are identical, investment being estimated in the same fashion in both 
models. Calculated values, $\tilde{I}$, are therefore obtained from (6.1) above.

Consumption and wage equations. Parameters in equations (4.1) and 
(4.3) need to be estimated by the simultaneous treatment of these two 
equations.¹⁸ The results, shown as (6.6), (6.7), and (6.8), are consistent
but not efficient. Because endogenous variables appear on the right 
hand side, estimation of (4.1) and (4.3) independently by least squares 
leads to biased results. Estimates of the coefficients obtained separately 
for each equation by least squares are shown in (6.7) and (6.8) in 
parentheses above the unbiased estimates, and suggest that the bias is 
not quantitatively important in the present case. Sampling errors of the 
unbiased estimates are shown in parentheses below the latter. 


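(6.6)  $C = 1061 + \underset{(0.174)}{0.721}\,(W_1 + W_2) + \underset{(0.146)}{0.383}\,(C)_{-3/2} - \underset{(0.065)}{0.192}\,(\tilde{I} + G - T) + u_1$

which is equivalent to

(6.7)  $C = \overset{(1375)}{1313} + \overset{(+0.696)}{0.655}\,(W_1 + W_2) + \overset{(+0.443)}{0.474}\,(C)_{-3/2} - \overset{(-0.257)}{0.238}\,\tilde{\Pi} + u_1'$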


18 For computational procedure, see Anderson and Rubin, op. cit. 
(6.8)  $W_1 = \overset{(570)}{604} + \overset{(+0.546)}{\underset{(0.098)}{0.706}}\,(C + \tilde{I} + G - W_2) - \overset{(-0.041)}{\underset{(0.096)}{0.117}}\,(C + I + G - W_2)_{-3/2} - \overset{(-1.67)}{\underset{(1.97)}{2.815}}\,t + u_3$


As before we calculated the residuals and estimated $\rho$ from them.
For (6.6)

$$\delta^2/s^2 = 1.67, \qquad \rho_1 = 0.177,$$

and for (6.8)

$$\delta^2/s^2 = 1.45, \qquad \rho_3 = 0.288.$$

For the consumption equation the result is just compatible with the 
assumption of random disturbances, but the wage equation, as in 
Model A, yields residuals that show significant autocorrelation. As in 
Model A, the investment equation is not solidly linked to the others: 
although lagged consumption plays a smaller role than in (6.2) and 
(6.3), the coefficients of $\tilde{I}$ in (6.6) and of $\tilde{\Pi}$ in (6.7) have the wrong sign.


TESTING THE MODELS 


How well do our models represent behavior, first during and second 
outside the sample period? The acid test is the model’s ability to pre- 
dict. Of course the forecasts of models such as ours are conditional, in 
the sense that the values of the exogenous (but not of the endogenous) 
variables must be known for the period to which the forecast applies. 
To be sure the practical forecaster may regard this as a fatal disad- 
vantage; here, however, we are concerned with prediction, not for its 
own sake, but only as a means of testing the models. The sample period 
comes to an end in 1940. Accordingly we shall test the model by making 
predictions for 1941 and for 1947 through 1952, omitting the war years 
as not relevant for the purpose in hand. 

The procedure is as follows. We can estimate the values of the
endogenous variables for any quarter (say, 1st of 1947) on the basis of
observed values of all quantities for preceding quarters (through 4th of
1946) and data for exogenous variables only ($W_2$, $T$ or $T'$, and $G$) for





426 AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEMBER 19% 


the current quarter (1st of 1947), and compare the estimates so
obtained with the observed values of the endogenous variables for the
current quarter (1st of 1947). We can repeat this for the 2nd quarter of 
1947, using complete data for the first quarter and exogenous variables 
for the second quarter, and so on. Thus we build up a succession of 
short-range predictions, each one quarter ahead. The question may 
then be posed, whether the performance of the model is appreciably 
better than guesswork. 

Solution of the three equations for each model, together with the 
accounting identities, yields estimates of six quantities: gross national 
product (GNP), national income, and the two endogenous
components of each ($I$, $C$; $W_1$, $\Pi$). Of course the six predictions are not
independent, for the accounting identities allow any three predicted
variables to be derived from the remaining three. Because the invest- 
ment equation stands by itself, in that it predicts a single endogenous 
variable in terms of predetermined variables, and because of the obvi- 
ous interest attaching to predictions of GNP, we shall for the sake of 
brevity report the quarter-by-quarter tests only for investment and 
GNP.²⁰ The outcome of the long-range extrapolation to be mentioned
later will be summarized for all six variables. 

In the following tabulations, s is the root mean square difference be- 
tween predicted and observed values; p the proportion of cases in 
which the direction of change from the preceding quarter (i.e. the sign 
of the first difference) is correctly predicted; and P the probability of 
so high a proportion of directions of change being correctly predicted 
by chance, on the assumption that correct and incorrect predictions 
are equally likely, and that probabilities of correct prediction in succes- 
sive quarters are independent. Clearly where variables are known to be 
autocorrelated, guesswork can easily improve upon the toss of a coin, 
and probabilities of correct prediction in successive quarters no longer 
are independent. Therefore we have computed p and P, not only for all 
quarters taken together, but also for those particular quarters in which 
the variable to be predicted was later observed to have reversed its di- 
rection. For this more stringent test, the chances of correct prediction 
in individual quarters appear to be completely independent. In simple 
language, it perhaps is not too hard to guess that an upward or downward
sweep will continue; it is less easy correctly to predict reversals of
direction.

19 To predict GNP, we also need to know depreciation, assumed to be predetermined.

20 Results of quarter-by-quarter tests for consumption and national income were similar to those
about to be given for investment and GNP, as indeed they must be in view of the accounting identities.
Short-range predictions for wage and nonwage income were inferior and could be more readily explained
by chance, a result perhaps connected with the shift in the distribution of income since World War II
referred to below.
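
The probability P defined above is the upper tail of a binomial
distribution with success probability one-half. The following is a minimal
sketch of that calculation, assuming exact binomial tails are wanted; the
function name `sign_test_p` is illustrative and does not come from the paper.

```python
from math import comb

def sign_test_p(correct: int, n: int) -> float:
    """Chance of calling at least `correct` of `n` directions right when
    each call is an independent toss of a fair coin."""
    favourable = sum(comb(n, k) for k in range(correct, n + 1))
    return favourable / 2 ** n

# Reversal tests for 1947-1952 as quoted in the text and tables.
for label, correct, n in [("investment", 7, 9),
                          ("GNP, Model A", 8, 11),
                          ("GNP, Model B", 9, 11)]:
    print(f"{label}: {correct}/{n}  P = {sign_test_p(correct, n):.3f}")
```

For these three cases the exact tails are 46/512, 232/2048, and 67/2048,
that is about .09, .113, and .033, in line with the tabulated values.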

Predicting investment. The investment equation (6.1), common to 
both models A and B, yields the results in Table 1. In predicting 28 
of the 40 reversals of direction during the sample period, the investment 
equation obviously does much better than could have been achieved by 
accident. The result during 1947-1953 is less impressive, for only 7 of 9 
reversals were correctly predicted, the chance of so good a result occur- 
ring by accident being about 1 in 10. 


TABLE 1

INVESTMENT: SUCCESSIVE QUARTERLY PREDICTIONS
(s in $ million at 1939 prices)

                                              Quarters in Which Direction
                             All Quarters       of Change Was Reversed
Period                      s      p      P          p        P

1923 to 1940
  (sample period)         477   51/72   .001      28/40     .008
1941                      343    3/4    .31        0/1     1.00
1947 to 1952              567   17/24   .032       7/9      .09





Since the investment equation (3.2) can be written in the form

\[
\begin{aligned}
I ={}& \rho_2 (I)_{-1} + \beta_0(1 - \rho_2)
      + \beta_1[(\Pi)_{-3/2} - \rho_2(\Pi)_{-5/2}] \\
    & + \beta_2[(\Pi)_{-7/2} - \rho_2(\Pi)_{-9/2}]
      + \beta_3[(\Pi)_{-11/2} - \rho_2(\Pi)_{-13/2}] \\
    & + \beta_4[(K)_{-1} - \rho_2(K)_{-2}] + v,
\end{aligned}
\]

we can estimate the variance of forecast values of I as a sum of products,
each product being obtained by multiplying an estimated variance or
covariance of β₁, β₂, β₃, β₄, and ρ₂ (or of their products) by a square or
product of values of predetermined variables in the forecast period,
together with an estimate of the residual variance.²¹ Taking as values of
the predetermined variables their mean level during the 24 forecast
quarters 1947-1952, we can approximate a standard error of forecast for
the postwar period of $907 million (1939 prices). The root mean square of
the observed errors of prediction for 1947-1952 (s in Table 1) was
somewhat smaller than this.

21 See, e.g., H. Hotelling, “Problems of Prediction,” American Journal of Sociology, 48
(1942-43), pp. 61-76.

Predicting gross national product. The performance of the complete 
models A and B is conveniently tested by predicting GNP.²² We did
not estimate the standard error of forecast for GNP because of the 
heavy computation required. But we carried out the other tests de- 
scribed above, both for models A and B and for three “guesswork 
models” of progressively increasing sophistication. Call GNP y and in- 
dicate lags, as before, by suffixes. Assume first that GNP will be the 
same this quarter as last quarter: 


\[ y = (y)_{-1}. \tag{Guesswork Model I} \]


Embodying the simplest possible assumption, this model cannot of 
course be used for forecasting direction of change, but it affords a cri- 
terion for measuring errors of prediction. 

Assume, second, that GNP’s change from last quarter will be one- 
half GNP’s change last quarter from the preceding quarter: 


\[ y = (y)_{-1} + \tfrac{1}{2}\,[(y)_{-1} - (y)_{-2}]. \tag{Guesswork Model II} \]


The fraction one-half is chosen arbitrarily on the assumption that for a 
stable series such as GNP the autoregression coefficient between first 
differences lies between 0 and 1.²³

Thirdly, let us fit a second-order difference equation to the observed 
values of GNP during the sample period (1923-1940) : 


\[ y = 4828 + 0.873(y)_{-1} - 0.125(y)_{-2}. \tag{Guesswork Model III} \]


Model III is perhaps no longer pure guesswork, but it still will serve as 
a standard of performance against which to test our two econometric 
models A and B. Results of the tests are shown in Tables 2, 3, and 4. 
During the sample period, both of the econometric models fit the ob- 
served data markedly better than any of those based on guesswork 
(Table 2). However, in predicting reversals of direction, only Model A 
performs significantly better than might be expected from chance. 
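
The three guesswork rules and the two scoring measures (s and the count of
correctly predicted directions) can be sketched as follows. This is only an
illustration: the function names are assumptions, and the short GNP series
is hypothetical, not data from the paper. Model III uses the coefficients
quoted above.

```python
from math import sqrt

def model_I(y, t):
    """No change: y_t = y_{t-1}."""
    return y[t - 1]

def model_II(y, t):
    """Half of last quarter's change carries over."""
    return y[t - 1] + 0.5 * (y[t - 1] - y[t - 2])

def model_III(y, t):
    """Second-order difference equation with the quoted coefficients."""
    return 4828 + 0.873 * y[t - 1] - 0.125 * y[t - 2]

def score(y, model):
    """Return (s, hits, n): root mean square error, number of correctly
    predicted directions of change, and number of predictions."""
    sq_err, hits, n = 0.0, 0, 0
    for t in range(2, len(y)):
        forecast = model(y, t)
        sq_err += (forecast - y[t]) ** 2
        # A direction is called correctly when the predicted and actual
        # first differences have the same sign.
        if (forecast - y[t - 1]) * (y[t] - y[t - 1]) > 0:
            hits += 1
        n += 1
    return sqrt(sq_err / n), hits, n

# Hypothetical quarterly GNP in $ million; NOT data from the paper.
gnp = [19000, 19400, 19900, 20100, 19800, 19600, 19900, 20500]
for name, model in [("I", model_I), ("II", model_II), ("III", model_III)]:
    s, hits, n = score(gnp, model)
    print(f"Model {name}: s = {s:.0f}, p = {hits}/{n}")
```

Because Model II extrapolates the last change with a positive fraction, it
always forecasts a continuation and so can never call a reversal, which is
why its entries for reversal quarters are zero. Model I forecasts a zero
change, so this scorer counts no hits for it; the tables appear to credit
it with one-half of the cases instead.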





22 GNP equals (C+I+G) together with depreciation, all in 1939 prices. C and I are predicted by 
the model; G and depreciation are exogenous. When predicting with models A and B, we use the
respective values of ρ₁ and ρ₂ obtained from residuals observed during the sample period and quoted
earlier. 

23 Carl Christ and Milton Friedman have each endorsed a coefficient of unity for this test (National
Bureau of Economic Research, Conference on Business Cycles, New York, 1951, pp. 57, 110). But unity
is an extreme value in the context, and seems unlikely to give as good predictions as some coefficient
near the middle of the range (0, 1).

TABLE 2

GROSS NATIONAL PRODUCT: SUCCESSIVE QUARTERLY
PREDICTIONS, 1923-1940 (SAMPLE PERIOD)

(s in $ million at 1939 prices)

                                             32 Quarters in Which
                        All 72 Quarters      Direction of Change
                                                 Was Reversed
                      s        p       P         p        P

Econometric Model
  A                  652    54/72    .0001     24/32     .003
  B                  708    51/72    .0002     20/32     .108
Guesswork Model
  I                  760    36/72    .5        16/32     .57
  II                 805    40/72    .2         0/32    1.00
  III                939    36/72    .5        15/32     [illegible]





During 1941—the year immediately following the close of the sample 
period—GNP rose steadily, so that no reversals of direction occurred 
(Table 3). Here, because the period contains no turning point, Guess- 
work Model II performs as well as the econometric models. 


TABLE 3

GROSS NATIONAL PRODUCT: SUCCESSIVE QUARTERLY
PREDICTIONS, 1941

(s in $ million at 1939 prices)

                          s       p       P

Econometric Model
  A                      425     4/4     .06
  B                    1,857     4/4     .06
Guesswork Model
  I                    1,118     2/4     [illegible]
  II                     594     4/4     .06
  III                  3,115     0/4     .94





We also carried out the tests for the postwar period, with the results
shown in Table 4. Again the econometric models perform much better
than any of the attempts at guesswork. Although during 1947-1952
they perform less satisfactorily than during the sample period, both
Models A and B show significant predictive power if all 24 predictions
are considered independent. On the other hand if the test is confined to
the prediction of reversals of direction in GNP, the result is less
clear-cut. For Model B, chances are 30 to 1 against accidentally calling
9 of the 11 turning points; but for Model A the chances are only 10 to 1
against calling 8 of the turns. Yet the consistently superior performance
of Model A, when measured by root mean square error of prediction, is
noteworthy.

TABLE 4

GROSS NATIONAL PRODUCT: SUCCESSIVE QUARTERLY
PREDICTIONS, 1947-1952

(s in $ million at 1939 prices)

                                             11 Quarters in Which
                        All 24 Quarters      Direction of Change
                                                 Was Reversed
                      s        p       P         p        P

Econometric Model
  A                  651    19/24    .003      8/11     .113
  B                1,087    18/24    .011      9/11     .033
Guesswork Model
  I                  678    12/24    .58      5½/11     .61
  II                 687    13/24    .42       0/11    1.0
  III              4,499     6/24    .99       5/11     .73


A LONG-RANGE EXTRAPOLATION 


Instead of a series of short-range predictions, each one quarter into 
the future, we may project a single extrapolation as far ahead as we 
please, provided only that the exogenous variables are known or can 
be estimated. Suppose a computer to sharpen his pencil after close of 
business on December 31, 1946. He is supplied with the model, with 
data for all variables up to and including the quarter now ended, and 
with (correctly) anticipated values of the exogenous variables for each 
quarter through the end of 1952. His forecasts of investment and 
GNP (both in 1939 prices) are shown in Charts I and II. Predicted
investment rises sharply to a peak in 1948 two quarters earlier than the
downturn in actual investment. Predicted investment then declines
steadily for three years, missing entirely the pre- and post-Korea boom,
turning up again only in the summer of 1951. In the case of GNP, the
upward trend from 1947 to 1952 is correctly predicted, but the wavelike
movement forecast for 1949-50 did not eventuate.

CHART I. Net Private Domestic Investment, as actually observed and as
predicted by Models A and B. The predictions are assumed to be made as of
Dec. 31, 1946. (Vertical axis: billions of 1939 dollars per quarter.)
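
The difference between a series of one-quarter-ahead predictions and a
single long-range extrapolation can be sketched with Guesswork Model III,
the only complete equation quoted in this part of the paper. The function
names and the observations are hypothetical; the point is only that the
long-range path feeds its own forecasts back in as lagged values.

```python
def one_step_ahead(observed, start):
    """Forecast each quarter from the two preceding OBSERVED values."""
    return [4828 + 0.873 * observed[t - 1] - 0.125 * observed[t - 2]
            for t in range(start, len(observed))]

def long_range(history, horizon):
    """Forecast `horizon` quarters ahead from the last two observations,
    feeding each forecast back in as a lagged value."""
    path = list(history[-2:])
    for _ in range(horizon):
        path.append(4828 + 0.873 * path[-1] - 0.125 * path[-2])
    return path[2:]

# Hypothetical observations in $ million; NOT data from the paper.
observed = [30000, 30500, 31200, 31000, 31800, 32500]
print(one_step_ahead(observed, start=2))
print(long_range(observed[:2], horizon=4))
```

The two methods agree in the first forecast quarter and diverge afterwards.
Left to itself, a difference equation with these coefficients settles
toward its own equilibrium level, 4828/(1 - 0.873 + 0.125), or roughly
$19.2 billion per quarter, whatever the starting point.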

Summary totals for each variable—actual and predicted—for the 
six-year period are shown in Table 5. 

For GNP, root mean square differences (s) between prediction and 
observation for the 24 quarters are: Models A and B, $2.1 and 2.7 bil- 
lion respectively; Models I, II, and III, $3.9, 5.6, and 15.8 billion re- 
spectively. 

Although the econometric models predict six-year totals for GNP 
and national income better than any of the guesswork models, the 
same is not uniformly true of the components. Indeed Guesswork 
Model I, whose performance elsewhere was so indifferent, happens 
here to score a bull's-eye in forecasting the six-year total for investment!
Even so unsophisticated a guess must sometimes turn out correct.

CHART II. Gross National Product, as actually observed and as predicted by
Models A and B. The predictions are assumed to be made as of Dec. 31, 1946.
(Vertical axis: billions of 1939 dollars per quarter.)

TABLE 5
LONG-RANGE PREDICTIONS, 1947-1952

(Six-year totals in $ billion at 1939 prices)
Assumed date of forecast: December 31, 1946

                          I      C     GNP     W₁      Π      Y

As actually observed     41    576    855    451    171    694
Econometric model
  A                      60    566    865    421    210    704
  B                      63    584    885    434    218    724
Guesswork model
  I                      40    546    790    435    143    650
  II                     27    522    740    419    138    622
  III                 not computed    493       not computed

Some of the errors of the econometric models can readily be ra- 
tionalized. Indeed, the diagnosis of errors is very necessary if models 
are to be improved. In the present instance, the overstatement of
investment by the econometric models may be explained by our use of
nonwage income before taxes, Π, as independent variable; taxes
(especially here corporate taxes) are notoriously higher today than during
the sample period.²⁴ Moreover Π itself is overstated, and this is related
to the understatement of W₁; a shift appears to have occurred in the
distribution of income since the close of the sample period. To adapt
the models to take account of these particular changes would not be 
difficult, but such revisions lie outside the scope of this paper. 


COMPARISON OF ANNUAL WITH QUARTERLY MODELS 


Annual data cover up movements occurring within the year. A model 
fitted to annual data contains less information than a quarterly model. 
The former will not yield quarterly information without many arbitrary 
assumptions, but a quarterly model can always be converted to an 
annual basis by summation. 

To approximate an annual investment equation, we combine (3.2)
and (3.6), putting ρ₂ = 0 and using the values for other parameters
given in (6.1). We obtain²⁵

\[
\begin{aligned}
(I)_3 + (I)_2 + (I)_1 + I
 ={}& A + 0.438(\Pi)_{3/2} + 0.414(\Pi)_{1/2} + 1.007(\Pi)_{-1/2} \\
    & + 0.953(\Pi)_{-3/2} + 0.346(\Pi)_{-5/2} + 0.327(\Pi)_{-7/2}
      - 0.184(\Pi)_{-9/2} \\
    & - 0.174(\Pi)_{-11/2} - 0.198(K)_{-1} + U
\end{aligned}
\]

where A is a constant and U is a linear combination of disturbances
which for the present purpose we may consider random. Time zero
is February 15 of the year for whose four quarters investment is
estimated. Taking (Π)_{3/2} and (Π)_{1/2}, together with one-half
(Π)_{-1/2}, as relating to the current year, dividing the remaining lagged
values of Π between the two preceding years, summing the coefficients,
and dividing by 4, we obtain the following annual equation (time zero
becoming June 30):





\[ I = A + 0.34\,\Pi + 0.51(\Pi)_{-1} - 0.07(\Pi)_{-2} - 0.20(K)_{-1} + U. \tag{7.1} \]

24 At the time our model was estimated, no adequate breakdown of income taxes by kind of income
was available. Annual estimates for disposable income have only recently been compiled in a fashion to
segregate wage from other income: see Lenore Frane and L. R. Klein, “The Estimation of Disposable
Income by Distributive Shares,” Review of Economics and Statistics, Vol. XXXV (1953), pp. 333-37.

25 To derive this result we substitute [K - (K)_{-1}] for I in (6.1) and compute [(K)_3 - (K)_{-1}] by
recursion starting out with the expression for K, obtaining (K)_1, then (K)_2, and then (K)_3.


This compares with the following equation obtained directly from
annual data:²⁶

\[ I = A + 0.23\,\Pi + 0.55(\Pi)_{-1} - 0.15(K)_{-1} + U. \tag{7.2} \]
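
The recursion by which the four-quarter sum of investment is obtained
(substituting K - (K)_{-1} for I and carrying K forward three quarters) can
be sketched as below. The quarterly coefficients used are NOT quoted from
(6.1), which is not reproduced here; they are assumed values, backed out
from the printed four-quarter sum on the supposition that nonwage income
enters at lags of 3/2, 7/2, and 11/2 quarters. The function name is
likewise illustrative.

```python
def annual_sum_coefficients(beta_pi, beta_k, quarters=4):
    """Coefficients of [(K)_3 - (K)_{-1}], i.e. of the four-quarter sum
    of I, given a quarterly equation
        I = const + sum of b * (Pi)_{lag} + beta_k * (K)_{-1}.
    Dates are in quarters relative to the first quarter of the year."""
    g = 1.0 + beta_k      # K = (1 + beta_k) * (K)_{-1} + const + Pi terms
    coef_k = 1.0          # running coefficient of (K)_{-1}
    coef_pi = {}          # running coefficients of Pi, keyed by date
    for t in range(quarters):
        coef_k *= g
        coef_pi = {date: g * c for date, c in coef_pi.items()}
        for lag, b in beta_pi.items():
            coef_pi[t + lag] = coef_pi.get(t + lag, 0.0) + b
    return coef_pi, coef_k - 1.0

# ASSUMED quarterly coefficients (illustrative, not taken from (6.1)).
beta_pi = {-1.5: 0.438, -3.5: 0.615, -5.5: -0.205}
beta_k = -0.0537

pi_coef, k_coef = annual_sum_coefficients(beta_pi, beta_k)
for date in sorted(pi_coef, reverse=True):
    print(f"(Pi) at {date:+.1f}: {pi_coef[date]:.3f}")
print(f"(K)_-1: {k_coef:.3f}")
```

With these assumed inputs the recursion gives coefficients within about
0.001 of those printed in the four-quarter sum, which suggests the lag
structure is read correctly; it does not establish the exact estimates.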


The long run interpretation of the quarterly consumption equation
(6.3) has already been discussed. To convert this conveniently to an
annual basis, write (C)_{-3/2} = ½[(C)_{-1} + (C)_{-2}]. After some
substitution and rearrangement the following is obtained (lags refer to
quarters):

\[
\begin{aligned}
(C)_3 + (C)_2 + (C)_1 + C
 ={}& A + 0.607(C)_{-1} + 0.877[(C)_{-2} + (C)_{-3} + (C)_{-4}] \\
    & + 0.270(C)_{-5} + 0.035(Y)_3 + 0.051(Y)_2 + 0.076(Y)_1 \\
    & + 0.096Y + 0.061(Y)_{-1} + 0.044(Y)_{-2} + 0.020(Y)_{-3} \\
    & + U,
\end{aligned}
\]


where as before A is a constant and U is a linear combination of vari- 
ables that may be considered random. Summing coefficients applicable 
to respective years and dividing by 4 we obtain as an annual consump- 
tion equation: 


\[ C = A + 0.81(C)_{-1} + 0.07(C)_{-2} + 0.06Y + 0.03(Y)_{-1} + U. \tag{7.3} \]
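
The step from the four-quarter sum to (7.3) is a matter of grouping the
quarterly lags by year, summing, and dividing by 4. A minimal sketch, using
the coefficients printed above (the function name and the dictionary layout
are illustrative):

```python
# Coefficients of the four-quarter sum, keyed by quarterly lag
# (keys 0..3 are the quarters of the current year).
c_coef = {-1: 0.607, -2: 0.877, -3: 0.877, -4: 0.877, -5: 0.270}
y_coef = {3: 0.035, 2: 0.051, 1: 0.076, 0: 0.096,
          -1: 0.061, -2: 0.044, -3: 0.020}

def annualize(coef):
    """Group quarterly lags into years (0..3 -> year 0, -1..-4 -> year -1,
    -5..-8 -> year -2), sum each group, and divide by 4."""
    out = {}
    for lag, c in coef.items():
        year = lag // 4          # floor division handles negative lags
        out[year] = out.get(year, 0.0) + c / 4
    return {year: round(v, 2) for year, v in sorted(out.items(), reverse=True)}

print(annualize(c_coef))   # lagged consumption, by year
print(annualize(y_coef))   # income, by year
```

The grouped values round to 0.81 and 0.07 for consumption lagged one and
two years, and to 0.06 and 0.03 for current and lagged income, as in (7.3).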


The autoregressive component, although not quite so overwhelming 
as in the quarterly equation from which it is derived, still is far stronger 
than that obtainable directly from annual data:²⁷

\[ C = A + 0.46(C)_{-1} + 0.51Y + 0.03t + U. \tag{7.4} \]


The consumption equation in Model B may also be adapted to an 
annual basis. A procedure strictly analogous to that used with equation 
(6.3) when applied instead to (6.7) yields the following annual con- 
sumption equation: 


\[
\begin{aligned}
C ={}& A + 0.12(C)_{-1} + 0.01(C)_{-2} + 0.89(W_1 + W_2) \\
     & + 0.20(W_1 + W_2)_{-1} - 0.32\,\Pi - 0.07(\Pi)_{-1} + U.
\end{aligned}
\tag{7.5}
\]


The closest comparable equation obtained directly from annual data
reports positive coefficients for Π and (Π)_{-1}:²⁸

\[ C = A + 0.80(W_1 + W_2) + 0.02\,\Pi + 0.23(\Pi)_{-1} + U. \tag{7.6} \]


26 Klein, op. cit., p. 68.
27 Carl Christ, “A Test of an Econometric Model for the U.S.,” Conference on Business Cycles, p. 75.
28 Klein, op. cit., p. 68.

The two wage equations yield the following annual versions. From
(6.5):

\[ W_1 = A + 0.65(C + I + G - W_2) - 0.07(C + I + G - W_2)_{-1} + 16t + U. \tag{7.7} \]

From (6.8):

\[ W_1 = A + 0.63(C + I + G - W_2) - 0.04(C + I + G - W_2)_{-1} + 45t + U. \tag{7.8} \]

Computed from annual data:²⁹

\[ W_1 = A + 0.42(C + I + G - W_2) + 0.16(C + I + G - W_2)_{-1} + 130t + U. \tag{7.9} \]


It will be seen that the sum of the coefficients of (C+I+G-W₂), this
year and last year, is practically the same in all three equations. How- 
ever individual coefficients differ somewhat, and the time trend re- 
ported from annual data is much stronger than in either of our quar- 
terly models. 


CONCLUSION 


To the question whether it is possible satisfactorily to represent
quarterly movements of gross national product in the U.S. economy by as
simple an equation system as that discussed here, these results offer no 
conclusive answer. 

An inspection of the two models described reveals three main weak- 
nesses. (1) In each model at least one of the three equations showed 
significant autocorrelation of residuals. (2) In the recursive model (A) 
the coefficient of income in the consumption equation is small, and its 
significance could not be established, so that the linkage between the 
first two equations in this model is poor. The hybrid model (B) shows 
a different, though probably related weakness—the coefficient of non- 
wage income in the consumption equation is negative. (3) In both 
models the sampling errors of some coefficients are uncomfortably 
large. 

Constructed from data for 1923-1940, the models were tested by 
their performance during 1947-1952. (1) In a series of short-range pre- 
dictions one quarter ahead both models performed uniformly better 
than guesswork, but their superiority was not decisive in a statistical 
sense. Probabilities against results being obtained by chance ranged
from about 10 to 1 to much higher and clearly significant odds,
depending upon the extent to which successive predictions were assumed
to be independent. (2) In a single long-range prediction by each model
for the entire six-year period, the econometric models forecast GNP
with a smaller root mean square error than any of the guesswork
schemes. In terms of components of GNP it is not possible to say that
the models did appreciably better than guesswork.

29 Ibid.

Evidently room for improvement is large. The obvious advantage of 
more plentiful observations, both in constructing and testing of models, 
when quarterly data are used will not be fully realized until further 
progress has been made with the autocorrelation problem. It may well 
be that some of the defects of the models discussed can be overcome 
only through the use of a more detailed and complex system of equa- 
tions. 

APPENDIX 


The equations for obtaining least-squares estimates of the parameters
a_i of an equation with autocorrelated disturbances

\[
x = a_0 + a_1 z_1 + \cdots + a_n z_n + u, \qquad
u = \rho\,(u)_{-1} + v \quad (v \text{ random})
\tag{8.1}
\]

are obtained by choosing that set of estimates of a_i and ρ which
minimize Σ v², summed over the sample. The sample observations cover the
period t = 1, 2, ..., T. We can rewrite (8.1):

\[
x - \rho(x)_{-1} = a_0(1 - \rho) + a_1[z_1 - \rho(z_1)_{-1}] + \cdots
  + a_n[z_n - \rho(z_n)_{-1}] + v.
\tag{8.2}
\]


(It will be noticed that if ρ = 1, the case is equivalent to the use of
first differences.) Assume that all variables are measured from their
means, and write

\[ X = x - \rho(x)_{-1}, \qquad Z_i = z_i - \rho(z_i)_{-1}. \]


Minimization of Σ v² with respect to a_i then leads to the familiar
equations

\[
a_1 \sum_{1}^{T} (Z_1 Z_i) + \cdots + a_n \sum_{1}^{T} (Z_n Z_i)
  = \sum_{1}^{T} (X Z_i), \qquad i = 1, 2, \ldots, n.
\tag{8.3}
\]


We proceed to minimize Σ v² with respect to ρ and obtain

\[
\sum_{1}^{T} \bigl[X - a_1 Z_1 - \cdots - a_n Z_n\bigr]
  \bigl[-(x)_{-1} + a_1 (z_1)_{-1} + \cdots + a_n (z_n)_{-1}\bigr] = 0.
\tag{8.4}
\]


The sums of (8.3) are all quadratic expressions in ρ, so that we can
solve for a_i in terms of ρ. The solution is in each case the ratio of two
polynomials, each of degree 2n in ρ. The coefficients are throughout
combinations of moments of sample observations. Inspection of (8.4)
shows that the highest order terms will be of the form a_i a_j ρ. Therefore
substitution of the polynomials obtained from (8.3) in (8.4) will lead
to a polynomial of degree (4n+1) in ρ.

Although of high degree (17th degree in the case of our investment
equation, which has four independent variables) these equations in ρ
are not hard to solve numerically by iteration, because the relevant
(and often the only real) root must lie in the interval 1 ≥ ρ ≥ -1, at
least for series where the end effect may be neglected. For if u satisfies

\[ u = \rho\,(u)_{-1} + v \]

then

\[ E\,u(u)_{-1} = \rho\,E(u)^2_{-1} + E\,v(u)_{-1}. \]

Now (u)_{-1} can be expressed as a constant plus a linear combination
of past values of v up to (v)_{-1}. Hence if the v are mutually independent,
E v(u)_{-1} must vanish and

\[ \rho = \frac{E\,u(u)_{-1}}{E(u)^2_{-1}} \tag{8.5} \]

which is the autocorrelation coefficient for u in a long series.

Once (8.4) is solved for ρ, a_i may be obtained by substituting ρ in
(8.3).

If the residuals in the structural equation are assumed to satisfy a
second or higher order difference equation, estimation is a much more
difficult matter, since simultaneous nonlinear equations must be solved.
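
The first-order case can be illustrated numerically. The sketch below does
not solve the polynomial in ρ described in this appendix; as a swapped-in
and simpler technique, it searches ρ over an even grid on the interval
[-1, 1], solving the normal equation (8.3) at each grid point and keeping
the value with the smallest Σ v². It assumes a single independent variable
and uses simulated data with known parameters; all names and numbers are
hypothetical.

```python
import random

def demean(series):
    """Measure a series from its mean."""
    m = sum(series) / len(series)
    return [s - m for s in series]

def fit_given_rho(x, z, rho):
    """For a fixed rho, transform the data as in (8.2) and solve the single
    normal equation (8.3); return (a1, sum of squared v)."""
    X = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
    Z = [z[t] - rho * z[t - 1] for t in range(1, len(z))]
    a1 = sum(Xi * Zi for Xi, Zi in zip(X, Z)) / sum(Zi * Zi for Zi in Z)
    ssr = sum((Xi - a1 * Zi) ** 2 for Xi, Zi in zip(X, Z))
    return a1, ssr

def estimate(x, z, steps=200):
    """Search rho on an even grid over [-1, 1] for the smallest sum of
    squared v; return (rho, a1) at the minimum."""
    x, z = demean(x), demean(z)
    best = None
    for k in range(steps + 1):
        rho = -1.0 + 2.0 * k / steps
        a1, ssr = fit_given_rho(x, z, rho)
        if best is None or ssr < best[2]:
            best = (rho, a1, ssr)
    return best[0], best[1]

# Simulated data with known parameters (hypothetical, for checking only).
random.seed(0)
true_a1, true_rho, T = 2.0, 0.6, 1000
z = [random.gauss(0, 1) for _ in range(T)]
u, x = 0.0, []
for t in range(T):
    u = true_rho * u + random.gauss(0, 1)   # u = rho * (u)_{-1} + v
    x.append(5.0 + true_a1 * z[t] + u)

rho_hat, a1_hat = estimate(x, z)
print(rho_hat, a1_hat)
```

A grid search trades the exactness of the polynomial solution for
simplicity; the grid spacing (0.01 here) bounds the precision of the
estimate of ρ.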





PROBLEMS OF COORDINATING THE UNITED STATES 
STATISTICAL SYSTEM 


Stuart A. Rice
U. S. Bureau of the Budget 


The statistical system of the United States embraces a great many
official, semi-official, and unofficial agencies and instruments.
Together these “comprise a system in the same sense that the activities
of four and one-half million business units comprise a national economic
system.”¹ Elsewhere I have defined the terms here discussed as follows:
“A statistical system exists when coherence is established and main- 
tained among the separate programs that compose it. Such coherence 
requires an item-to-item adjustment of each task and process to every 
other related task and process, whether the relationship be one of con- 
ceptual congruity or one of consistency in operational patterns and 
sequences. The process of attaining and maintaining this coherence is 
called ‘co-ordination’.”²

The end sought is a better integration of our nation’s statistical in- 
telligence. Why is this desirable? I suggest two closely related reasons: 
First, an integrated system is more efficient. Second, it gives us a better 
understanding of the world in which we live. 

Our world is precariously balanced between forces which are further- 
ing advances in civilization and others which are pushing us toward 
universal catastrophe. The balance among them can easily be tipped 
in one direction or the other. We cannot afford to blunder because of 
an inadequate understanding of the forces with which we deal. Nor can 
we afford to spend a single taxpayer’s dollar to lesser advantage than 
we might, when the demands upon our Government and upon the 
whole economy are so numerous and so pressing. If spent for statistics, 
that dollar should produce the greatest possible yield of useful informa- 
tion. 

The needs of users of statistics are seldom limited to a single series. 
For example, they may need to know simultaneously the facts about 
employment and production—not separately but in relationship. The 
data used, whatever their separate sources, should intermesh. Employ- 
ment and production series will not intermesh unless the definitions of 
reporting units correspond and unless they are grouped in accordance
with the same industrial classification. Not so long ago, as the life of a
bureaucrat goes, both “employment” and “industries” were separately
and often inconsistently defined and classified by different Federal
agencies. Hence, basic steps toward the integration of our system of
statistical intelligence were taken with the development, promulgation,
and incorporation into general use of the standard industrial
classification. Other steps were the standardization of the types by
status of persons included in the economically active population and the
establishment of uniformity in reporting periods for employment.

1 Stuart A. Rice, “The Role and Management of the Federal Statistical System,” The American
Political Science Review, XXXIV (1940), p. 481.

2 Ibid., “Co-ordination of Federal Statistical Programs,” The American Journal of Sociology, L
(1944), p. 22.

The importance of such standards is illustrated in reverse when, in 
ignorance of them, investigators of economic and social problems 
formulate their own categories. Too late they may discover that their 
results are noncomparable with basic data, like those of the Bureau 
of the Census, with which to have significance they should be aligned. 

The relative merits of two general methods of integrating a statistical 
system have long been debated. One is administrative centralization, 
achieved with such conspicuous effectiveness in Canada. The other is 
decentralization, accompanied by coordination, exemplified by the 
statistical system of the United States. After many discussions of these 
assumed alternatives it is my judgment that the issues between them 
are largely unreal. In any event I disclaim partisanship. The statistical 
system of any country is an outgrowth of historical development. 
Within it are strains toward adaptation to the political, social, and 
economic structure which it serves and of which it is a part. 

The United States has developed an over-all pattern of statistical 
decentralization from which it is now too late to depart. Federal sta- 
tistical activities began to proliferate in a decentralized fashion im- 
mediately after the birth of the republic and the trend thus established 
has continued. The pattern is remarkably adaptive to its milieu. The 
uniquely large volume of information produced by the statistical sys- 
tem of the United States reflects the factual-mindedness of our people. 
Statistics have been in demand and the demand could not have been 
satisfied so easily or so well except under the conditions provided by 
the historical decentralization of the nation’s statistical mechanisms. 

Other considerations support the efficacy for us of our own decen- 
tralized system. It permits a useful division of labor among agencies, 
public and private. It keeps the collection of many data in close contact 
with the uses to which they are to be put when assembled, thus giving 
protection and assurance to administrators in the fields of utilization. 

Nevertheless, without a central mechanism for the coordination of 
statistical activities the scene presented in a decentralized statistical
system would be chaotic. In this country the central agency of statistical
coordination, known as the Office of Statistical Standards, is a part
of the Bureau of the Budget in the Executive Office of the President. 
Its legal powers, affecting other Federal agencies, are more authorita- 
tive than it usually cares to assert; and its interests and concern, 
though not its legal authority, extend to numerous statistical activities 
which are under private direction. 

The limitation of the formal authorities of the Office of Statistical 
Standards to Federal agencies does not impede the integration of the 
nation’s statistical system. True, much statistical work is carried on by 
trade organizations and business units which are highly competitive. 
However, the dominance of the Federal Government in statistics and 
the prestige and influence of such Federal agencies as the Bureau of 
the Census and the Bureau of Labor Statistics are so great that busi- 
ness-produced statistics are coordinated informally to a very sub- 
stantial extent. 

Within the statistical system the problems of coordination are nu- 
merous and I shall have to offer a selection. 

1. The first is the problem of representing the general interest. It is 
naive to assume that governments are monolithic. The definite article 
before the noun—‘“the Government of the United States”—reflects an 
ideal that does not fully exist in actuality. The ensemble of Federal 
statistical agencies is not dissimilar to a trade association. The in- 
dividual Federal agencies, like the individual members of a trade 
organization for an industry, are competitive but recognize many 
common interests. To carry the analogy farther, the Office of Statistical 
Standards might be regarded as the secretariat of the trade association 
which serves the Federal statistical products industry. 

Customary procedures for financing and administering the separate 
members of this Federal statistics trade association tend to emphasize 
the autonomy of its members. The preparation of estimates of ex- 
penditures, their defense before the Bureau of the Budget and com- 
mittees of the Congress, and the administration of approved programs 
are all responsibilities of the separate agency units to which funds are 
specifically appropriated. These procedures make statistical coordina- 
tion more difficult, for an integrated statistical system should serve public 
and governmental interests which override or lie between the interests and 
responsibilities of particular agencies. 

Even when a Federal agency is persuaded to include in its estimates 
items of expenditure for purposes beyond its direct responsibilities, 
such items are inevitably the first to suffer when reductions are made,
whether in the budgeting, the appropriating, or the administering
stages. The OSS recognizes an obligation to serve as a “public de- 
fender” for general interests in statistical data; but recognition of this 
obligation in the Congress and even within the Administration itself 
is very limited. Neither specific legal authorities nor the sanctions 
conveyed by custom are present to lend weight to its appeals. When it 
has “gone to bat” in public for comprehensive objectives, as with the 
“reconversion statistics program” of 1945, the consequences have not 
been encouraging. 

The reasons are not far to seek: First, it is contrary to tradition for 
representatives of the Budget Bureau to appear before committees of 
Congress on behalf of appropriations. Secondly, the Bureau’s functions 
are properly inconspicuous and anonymous. They do not build up the 
type of public support which, in the case of other agencies, is sometimes 
mobilized behind programs. 

Such considerations as these led the Mills-Long task force on sta- 
tistics to recommend to Mr. Herbert Hoover’s first Commission on 
Organization of the Executive Branch of the Federal Government that 
the OSS be put in possession of free funds for disbursement at its dis- 
cretion on behalf of Federal statistical interests. This is not the place 
to outline the objections to this solution. 

A start has been made in another direction toward accomplishing 
the same objective. Each year the Office of Statistical Standards pre- 
pares a so-called “statistical budget,” recommending programs to be 
undertaken by the principal general-purpose statistical agencies: 
Bureau of the Census, Bureau of Labor Statistics, Bureau of Agricul- 
tural Economics, Office of Business Economics, National Office of Vital 
Statistics, and such joint enterprises as that on financial statistics of the 
Federal Trade Commission and the Securities and Exchange Com- 
mission. The purpose is to secure over-all balance and thus achieve 
a Federal rather than a series of unrelated departmental programs. 
This “statistical budget” is reviewed in the Bureau of the Budget like 
any other Federal proposal. 

The “general interest” has also been represented and furthered by 
the Office of Statistical Standards in various other ways. I have already 
mentioned some of the standard classifications and definitions that it 
has developed cooperatively with the “operating” statistical agencies 
for the use of all of them. For the Joint Committee on the Economic 
Report of the Senate and House of Representatives and for the Council 
of Economic Advisers it has prepared analyses of the gaps in our
national arsenal of statistical information. It was our privilege some
years ago to initiate the development of the monthly publication
Economic Indicators, now prepared for the Joint Committee by the 
Council; and we have just completed the technical work upon a supple- 
ment which interprets the sources from which the indicators are 
derived. Our “Blue Book” on Statistical Services of the United States 
Government has had wide national and international circulation. Our 
Federal Statistical Directory and our monthly Statistical Reporter have 
been useful instruments of statistical integration within the Govern- 
ment. Not least important in this list are the facilities we offer for con- 
sultation upon the general interest by the various Federal statistical 
agencies, coming together at our invitation upon neutral ground. 

2. Certain consequential problems are involved in the procedure of 
developing a statistical budget. The separate items it brings together 
must also find a place in the estimates of expenditure submitted and 
supported by the respective departments and agencies which will ad- 
minister the funds. Hence the statistical items must be adjusted to 
other items within the departmental budgets concerned. Occasions have 
arisen in which the Bureau of the Budget has differed with the head 
of a department or agency concerning priorities among the statistical 
and other items in his budget. There have been cases in which funds 
approved by the Bureau for statistical purposes have been diverted 
by administrators to other purposes which they deemed more im- 
portant. Their actions could not easily be challenged without violating 
the sound principle that the responsibilities of an administrator should 
be accompanied by command of the resources given him. 

3. Almost inseparable from the problem of defending the general 
interest is that of securing balance among particular agency and other 
interests of which a coordinated statistical program must take account. 
Each Federal agency wishes to collect data for which it feels an ad- 
ministrative need or for which there is a legal or public demand im- 
posed upon it. However, through processes of coordination, a single in- 
quiry may often be made to serve additional purposes. The data pro- 
duced can sometimes supply the essential needs of other agencies as 
well. The original agency is not necessarily made happy by this pros- 
pect. It is exposed to the danger that its own purposes may be inade- 
quately or belatedly advanced. 

Dangers to the “other agencies” are also present. The relationships 
established when one agency renders service to another may introduce 
seeming exceptions to the principle that responsibility should be linked 
with command. The relationships acquire a contractual character, 
especially if a regular flow of data from the servicing agency to its 
“customer” agency is desired. In scheduling production the administrator 
of the first must resist the frequent temptation to subordinate the 
interests of the second. There is perhaps no greater impediment to 
statistical integration than the fear that if important statistical work 
is contracted out to another organization it will lose priority. 

The attainment of optimum balance in an omnibus program, giving 
appropriate weight to each separate interest and to the general welfare, 
is one of the most delicate problems of statistical coordination. 

4. Another difficult problem is that of establishing demarcation 
lines between governmental and nongovernmental responsibilities for 
the collection of statistics. The governing principle is clearly that data 
should be collected at the expense of government only when vested 
with public interest. This formula shares the simplicity of the well 
known secret of money making in the stock market—buying low and 
selling high. The difficulties appear in the application of the principle. 
Few statistical series on any subject are without some public im- 
portance. Even a strictly private interest, if shared by a sufficient 
number of people, becomes of public interest through its implications 
for the economy. Aids to farmers in marketing their crops provide 
examples. To what figure would the number of beneficiaries have to 
be reduced for the “public” interest in statistical estimates of crop 
production to become strictly “private”? Ten thousand? One thousand? 
One hundred? Ten? Or the single producer? 

In historical fact, conceptions of what may or may not be appropriate 
undertakings by government are under constant revision. I have tried 
for many years to find some magic formula by which to apply the 
criterion of public interest to governmental statistical activities. I 
conclude that there is no general criterion and that the conception 
must be applied to individual situations as they arise, instance by in- 
stance. 

5. The extent to which the protection of “confidentiality” should 
be thrown around industrial or company data supplied to an agency 
of government raises almost equally difficult problems of discrimina- 
tion. These evoke the emotions of respondents; they provoke disputa- 
tion between exponents of monistic and pluralistic conceptions of gov- 
ernment; and they produce headaches for a statistical coordinating 
agency. 

Perhaps I am guilty of some bias in favor of the pluralistic concep- 
tion when I say that once again the governing principle is clear. This 
is that data supplied to an agency of government for statistical pur- 
poses should not be allowed, through disclosure, to cause individual 





444 AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEMBER 1954 


hardship or disadvantage. It should not be used to support legal action 
against the respondent in the courts. It should not fall into the hands 
of business competitors who would find therein a competitive advan- 
tage. 

By becoming an unreasoned fetish this wholesome principle has 
actually worked in the past to considerable disadvantage to both 
government and suppliers of statistical information. For many years 
two Federal statistical agencies, each legally empowered to collect 
the same information from the same respondents, believed themselves 
unable to share this information with each other and thus avoid the 
necessity of duplicating their inquiries of the public. 

This situation was mitigated by certain of the provisions of the 
Federal Reports Act of 1942; yet the “fetish” remains as an impediment 
to many practical steps that would otherwise assist in the integration 
of Federal statistical activities. In a recent instance agency A sought 
to avoid the dilemma by asking its business respondents to supply in- 
formation previously given to agency B, in accordance with specifica- 
tions copied by A from those of B. Still more recently it was demon- 
strated that some millions of dollars might be saved in conducting 
the (ill-fated) Census of Business for 1953 by utilizing the information 
reported on income tax returns by retailers having no employes. The 
procedures proposed would have provided full protection against dis- 
closure to competitors or to Federal agencies other than the two 
directly concerned. When the proposal was considered by an advisory 
group of business consultants it was initially viewed with a skepticism 
approaching horror. Gradually the motive of economy prevailed and 
its adoption was recommended. 

Encroachments by governments upon the liberties of individuals 
pose one of the great politico-ethical problems of our day. Federal 
statisticians and respondents alike are handicapped by the absence 
of a clear understanding and agreement upon the limits of “confi- 
dentiality” that should be attached to statistical returns. 

6. Analogous and sometimes related questions are raised by the 
needs of government to withhold from its own citizens certain statisti- 
cal data which, if they reached potential enemies, would give aid and 
comfort to the latter. Since these questions are not inherent in the 
processes of statistical coordination per se, I shall leave them aside 
with a single footnote reference.³ 

7. Lastly I mention the need for liaison between Federal statistical 





3 Stuart A. Rice and Joseph W. Kappel, “Strategic Intelligence and the Publication of Statistics,” 
The American Political Science Review, XLV (1951). 
agencies and the statistical profession. How can we in the Federal 
Government best consult with our nongovernmental colleagues? How 
can we obtain their advice upon the issues we face—advice which is at 
the same time technically competent, and fully informed respecting 
the setting in which the issues arise? The last condition is essential if 
advice given us is to be realistic. If the condition is met, the preparations 
for advice-giving must of necessity be very time-consuming, 
both for advisors and advisees. 

In the lucid and “sobering” analysis of the responsibilities that have 
been placed upon Federal statistics in connection with the nation’s 
practical affairs, presented in her notable Presidential address at 
Chicago in December, 1952, Mrs. Wickens⁴ grappled courageously 
with this thorny problem. She felt that the time had come “for the 
profession as a whole to share some responsibility for these statistics 
with those who make them.” She therefore proposed “that there be 
created a new United States Statistical Commission, with responsibil- 
ity for audit of statistical series, similar to an accounting audit, em- 
powered to put a ‘certified’ label on a statistical product. It should 
also be charged with investigation of methods, scope, and suitability 
of statistics, and with making recommendations for future improve- 
ments and developmental work. .. . Primarily, its membership would 
be drawn from experts outside government. ... It should be a con- 
tinuing body, serving on occasion as required, but with a small full- 
time staff, and adequate financing, so that our most distinguished 
statisticians, economists, scientists, and other specialists could reason- 
ably be expected to devote time and attention to its work... .” 

Mrs. Wickens’ proposal excited admiration for its boldness and 
breadth. It remains my opinion that the objectives which she visualized 
can be realized in the present only piecemeal and on a much more 
modest scale. How would the Commission, as a “continuing body,” be 
financed? If through Federal appropriations, how long could it escape 
the constricting influences that surround other Federal agencies? How 
would it divide its time between its technical functions and the annual 
and prolonged struggles over appropriations, over personnel appoint- 
ments and security clearances? Presumably such an organization would 
have to operate under the full and specific authorities possessed by the 
Office of Statistical Standards; but, how would it become related to 
such other existing Federal agencies with central functions and legal 
authorities as the Council of Economic Advisers? 





4 Aryness Joy Wickens, “Statistics and the Public Interest,” Journal of the American Statistical 
Association, 48 (1953), pp. 1-14. 
Most difficult of all, where would it find the distinguished members 
who could devote the requisite time to the labors that are visualized? 
Whatever the competence of the individuals who compose them, high 
level committees and commissions need continuity of membership and 
long exposure to the problems upon which they are asked to advise. 
The inevitably complex problems of government statistical organiza- 
tion cannot be quickly grasped. The Commission would demand mem- 
bers having a high degree of competence and general experience, but 
such people are invariably very busy. 

I can vigorously applaud Mrs. Wickens’ references to the “first 
steps” made by our profession “in these directions.” I feel great satis- 
faction in the services given to the Office of Statistical Standards by the 
Advisory Committee on Statistical Policy, created by the American 
Statistical Association at our request in 1951 and composed of mem- 
bers appointed from a distinguished panel of the Association’s present 
and past presidents and president-elect. Methodically, and without 
undue haste, the Committee has thought its way through the in- 
tricacies of a number of perplexing policy issues regarding which the 
Federal statistical services are entitled to look for leadership to the 
Office of Statistical Standards. These have involved some of the prob- 
lems that I have discussed above, including that pertaining to the 
confidentiality of individual returns and differentiation between the 
appropriate areas of governmental and nongovernmental agencies in 
the collection of data. 

The Committee is slowly approaching some of the other tasks which 
Mrs. Wickens proposed for the new Statistical Commission, such as 
“making recommendations for future improvements and develop- 
mental work.” I cannot foresee the arrival of a time in its work when it 
could itself undertake “responsibility for audit of statistical series” 
or detailed investigations of “methods, scope and suitability” of Federal 
statistics. 

We should like to make it possible, through suitable compensation 
for the services it renders, for the Advisory Committee on Statistical 
Policy to render even greater service in the future than it has in the 
past. We do believe that it should avoid becoming entangled in small 
issues and that it should be able to sift out for its attention policy 
questions of the highest priority; for we do not want a “captive com- 
mittee” that can be presented at regular intervals with a long list of 
activities for its approval. 

Other types of advice are needed and received by the Office of 
Statistical Standards from other sources. Intra-governmental com- 


COORDINATING THE UNITED STATES STATISTICAL SYSTEM 447 


mittees and conferences of representatives of Federal statistical 
agencies are meeting daily upon specific questions of coordination. The 
Advisory Council on Federal Reports, representing the world of in- 
dustry and business, operates with its own budget and secretariat, 
bringing to bear upon us the viewpoints and interests of respondents 
upon the procedural problems which arise in the Federal collection of 
data. Similarly, the Labor Advisory Committee on Statistics presents 
to us the needs for Government figures of an important segment of 
the statistics-consuming public. 

Mrs. Wickens closed her memorable address with the expression of 
belief that “As a profession, statisticians must organize to meet this 
challenge if statistics are to continue to be administered in the public 
interest.” Despite many set-backs and discouragements, and in ways 
less pretentious than by the specific device she proposed, I believe they 
are doing so. 





GROWTH BY MERGER* 


G. WARREN NUTTER 
Yale University 


LIKE the business cycle in the recent past, industrial organization is 
a topic more talked about than investigated. Every new empirical 
study will be devoured by the fact-hungry specialist in this field, in the 
hope that it will answer some of the long list of unsettled questions. 
Professor Weston’s recent book on mergers will surely be avidly read, 
but it will not relieve the hunger as much as one would hope. 

The book deals with many things, although about half of it is con- 
cerned with the main topic, namely, the role mergers have played in 
the absolute and relative growth of large firms. The remaining half dis- 
cusses the particulars of the most recent merger movement, the theory 
of mergers, and problems of economic policy. This review will concen- 
trate on the main topic: it is of greater interest to the general economist 
than the others and it lies less completely outside my areas of com- 
petence. The remainder of the book is certainly worthy of attention; 
it is slighted here because I find it less controversial and should have 
little to contribute on it in any event, in view of my bare nodding 
acquaintance with the history of mergers. The chapter on mergers of 
the 1940’s is a useful and interesting summary of the more reliable 
studies of this movement. The chapter on the theory of mergers, though 
it contains some doubtful conclusions on the motives behind them, 
sheds new light on factors explaining the timing of merger movements. 
The chapter on economic policy adds few original suggestions but sum- 
marizes the problems rather well. I leave it to experts on the history of 
mergers to examine the details of these chapters. 


THE CENTRAL ISSUE 


As I read the history of economic controversy, the central issue about 
mergers is whether industrial concentration and corporate giantism are 
in any significant degree traceable to them. There are two other sub- 
sidiary issues: whether concentration and bigness are the usual results 
of mergers; and whether they are the primary goals. These are all im- 
portant issues, but they must be recognized as distinct. Though the 
point will not be developed further, it can be noted that Weston does 
not always avoid mixing them up.¹ 





* A review article on J. Fred Weston, The Role of Mergers in the Growth of Large Firms (Berkeley 
and Los Angeles: University of California Press, 1953). Pp. xvi, 159. $3.50. 
1 See particularly the discussion in his book on pp. 34-37 and 51-57. 
The first issue is of importance because an answer to it may cast light 
on the bases of monopoly (or, to use a more widely accepted word, 
oligopoly) and big business. Theory tells us that a firm may achieve a 
dominant position in an industry for any of three broad reasons: (a) it 
may have genuine economies of scale relative to the market it operates 
in; (b) it may have “artificial” economies, such as patents; or (c) it may 
be able to fare better than average over the long haul through tempo- 
rary exploitations of its dominant position. It is virtually impossible to 
discover the comparative importance of these factors by direct search- 
ing for causes; we cannot, for instance, measure economies of scale. 
Hence we must rely on inferences from other, more indirect, evidence. 
One relevant matter is the way in which firms have been put together. 
It is at this point that the question of mergers enters. If a dominant 
firm has chosen to achieve a significant part of its growth through 
mergers, doubt is cast on the importance of economies of scale, cer- 
tainly as they might derive from plant operations. The greater the ex- 
tent to which dominance has been maintained through mergers, the less 
likely that economies of scale are a general basis of monopoly (or oligop- 
oly), particularly if the typical picture is one of a continual struggle to 
retain dominance in the face of continual encroachment by other firms. 

The link between analysis of mergers and bigness is similar. Without, 
of course, exhausting all possible reasons for corporate giants, one may 
suppose that the most important are advantages of monopolistic posi- 
tion and economies of producing multiple products. Here again the 
method and nature of growth helps us to discover reasons, if only by 
process of elimination. 

It is hard to know if Weston shares this view of underlying issues. 
In the opening sentences of his book he does suggest that interest in 
mergers arises from concern over broader questions, but he does not get 
down to details. He formulates the empirical problem in very general 
terms, apparently in order to make the findings applicable to a wide 
range of specific issues. “The sources, extent, and consequences of big- 
ness and concentration have been widely disputed,” he says, and “many 
questions remain unsettled.” He goes on to list some of them and con- 
cludes that, “although the formation of appropriate public policies 
awaits further study of these and related issues, our understanding of 
the nature and appropriate role of large firms may be increased by more 
complete information concerning the process of their growth” [7, p. 1]. 

There is little to dispute here, if we substitute “relevant” for “com- 
plete,” find out what is relevant, and frame the empirical problem ac- 
cordingly. Instead of using this approach, however, Weston looks di- 








rectly to economic literature and concludes that “mergers are often 
cited as the major source of economic concentration” [7, p. 2]. Hence he 
sets this up as the empirical question to be resolved. 

Some economists have undoubtedly made statements like this,² but 
it is doubtful that such statements reflect the heart of the controversy 
over the role of mergers. In any event, Weston’s framing of the empiri- 
cal question is unfortunate; for, as his later analysis reveals, he is trying 
to find out whether mergers account for more or less than half of the 
growth of “oligopolistic” firms, a matter of limited relevance to the 
underlying issues as I see them. The real question is whether the pattern 
of concentration and bigness as we now observe it would have emerged 
in the absence of mergers; that is, whether mergers have accounted for 
a significant portion of growth. There would seem to be no reason for 
choosing the fraction one-half as the dividing line between significance 
and insignificance. Yet he does so, explicitly as well as implicitly. For 
example, he characterizes mergers as a negligible factor in growth when 
he finds that they have accounted on the average for a fourth to a third 
of the absolute growth of firms studied [7, p. 30]. 

It would do Weston’s work an injustice to say that his data bear in 
no way on central issues; on the contrary, there is much here with direct 
bearing, available for the first time. I lament that there is not more, for 
there could have been with a slight change of orientation. Weston has 
collected certain types of data. He has used only a portion of these, 
summarized in ways best suited to his objectives. He has not given 
much attention to the problem of ordering the basic data in a flexible 
manner, a defect made more acute by failure to publish many of them. 
All of which is to say that it is very difficult to make much additional 
analysis of the data presented. On the other hand, something can be 
said about Weston’s findings, which is the task we turn to next. 


ABSOLUTE GROWTH OF FIRMS 


Weston selects 22 census industries with highly concentrated output 
in 1935 and studies the growth of 74 of their dominant firms,³ beginning 
in the earliest year for which data could be found in each case and ending 





2 But not in all of the places Weston cites. For instance, Corwin Edwards does not say anything 
about the quantitative importance of mergers [1, p. 142]; Stigler says something different [4, p. 23], 
as pointed out below in this review; and the statement from a government report quoted by Weston 
says that mergers are “probably more important than any other single factor” [7, p. 2], again not the 
same as Weston’s statement. 

3 He explains the basis for selecting industries [7, pp. 112-21] but not firms, other than to say that 
the latter are dominant in the selected industries. Important firms are omitted from several industries, 
e.g., electrical machinery, ammunition, ink, sewing machines, and cement. One is led to suppose that 
ease in obtaining data played a part in selection of the sample. 
in 1948. Growth is measured by accretion in value of total assets.⁴ 
Growth by merger, or “external” growth, is defined as the accretion 
resulting directly from all types of acquisitions of already existing 
firms; the residual is viewed as “internal” growth, that is, as growth by 
means other than merger. The relative importance of external growth 
is measured by the fraction of total growth accounted for by mergers, 
which we shall call “proportional growth by merger.” Because of con- 
ceptual and empirical difficulties in tracing external growth to its be- 
ginnings, Weston develops three measures, differing from each other 
in their treatments of assets in the earliest year for which he could find 
data. In the first measure the initial assets are regarded as external 
growth, that is, as resulting entirely from mergers; in the second they 
are excluded from both external and total growth, only subsequent 
growth being measured; in the third they are excluded from both ex- 
ternal and internal growth but not from total growth—that is, they are 
considered a separate component of growth. The first measure gives the 
highest estimate of proportional growth by merger, the third gives the 
lowest, and the second gives an intermediate estimate. 
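
The three measures can be made concrete with a small sketch. It is offered only as an illustration of the definitions just given: the names `initial`, `external`, and `total`, and the figures in the usage lines, are assumptions introduced here and are not taken from Weston's data. The sketch assumes that total growth is the value of assets in the final year and that `external` covers only acquisitions made after the earliest year of record.

```python
def proportional_growth_by_merger(initial, external, total):
    """Return the first, second, and third measures as fractions.

    initial  -- assets in the earliest year of record (hypothetical input)
    external -- assets acquired by merger after that year
    total    -- assets in the final year, taken as total growth
    """
    if total <= initial:
        raise ValueError("total must exceed initial assets")
    # First measure: initial assets counted as growth by merger.
    first = (initial + external) / total
    # Second measure: initial assets excluded from external and total growth.
    second = external / (total - initial)
    # Third measure: initial assets kept in total growth as a separate component.
    third = external / total
    return first, second, third


# Hypothetical firm: 100 of final assets, 14 of initial assets, 19 acquired by merger.
first, second, third = proportional_growth_by_merger(initial=14, external=19, total=100)
print(f"{first:.2f} {second:.2f} {third:.2f}")  # prints: 0.33 0.22 0.19
```

So long as internal growth (total less initial less external) is not negative, the first measure is at least as large as the second, and the second at least as large as the third, which is the ordering described above.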

Weston shows no preference for one measure over the other, and one 
must admit that it is difficult to choose among them. The first would 
be clearly the best if the initial assets of the largest firm taking part in 
the earliest merger were not counted as external growth, though there 
are conceptual problems here, too. As the measures stand, one can say 
that the second and third, taken by themselves, are more misleading 
than the first, because their understatement magnifies a strong down- 
ward bias that is present for another reason, discussed below. It is my 
guess that this bias overwhelms all others; therefore, at least for judging 
the over-all importance of mergers, the first measure would seem to be 
least bad, on the grounds that it probably minimizes a general down- 
ward bias. 

Like all empirical workers, Weston was faced with a host of measure- 
ment problems, none having a wholly satisfactory solution. He dis- 
cusses most of these quite adequately. For instance, he devotes an ap- 
pendix to the question whether total assets are a better index of size 
than any component of assets. He divides the basic data on firms into 
three groups, on the basis of the reliability that he attributes to them. 





4 Growth of firms in the steel industry is also sometimes measured in terms of tons of ingot steel 
capacity [7, pp. 22-23 and 132-34]. The reason for this measure, and for selected use of it (not always 
called to one’s attention), is not explained. 

5 Weston does something similar to this in an isolated appendix table [7, pp. 151-52], which plays 
no part in his analysis. He does not explain why the procedure was followed here or why it was not 
adopted as a general practice. 
According to his judgments, data are “accurate” (were fully confirmed 
by companies) for firms accounting for about a third of aggregate as- 
sets in 1948; “dependable” (partly confirmed) for firms accounting for 
about a half; and “questionable” (unconfirmed) for firms accounting 
for about a sixth. In general, he segregates findings for each group so 
that some allowance may be made for the relative shortcomings of 
basic information. Finally, he points out that his measures of external 
growth cover only that part of growth directly attributable to mergers; 
they do not take account of indirect effects. He argues that any effort 
to summarize both direct and indirect effects would lead to unanswer- 
able questions about how growth would have gone in the absence of 
mergers. In some cases it might have gone worse, in others better. This 
view seems reasonable enough in terms of strict logic, and examples can 
be found of both effects. Nevertheless, one wonders whether there is not 
in general a presumptive advantage in mergers, at least as far as rela- 
tive growth is concerned. Otherwise, why do firms continue to engage 
in them?⁶ 

In view of the attention paid to some of these problems, it is strange 
to find no mention of one that probably overweighs all others: how to 
deflate the value of assets to allow for changing price levels. Even if we 
waive all other troublesome problems connected with measuring growth 
of capital (which is in itself hardly permissible), we are not justified in 
taking a dollar’s worth of assets as representing the same real value in, 
say, both 1900 and 1948; some adjustment must be made for differences 
in price levels. By Weston’s procedures, total growth is represented 
by the value of assets recorded in 1948.⁷ Thus it is measured in 
terms of recent prices. External growth, on the other hand, is repre- 
sented by assets valued at the times mergers occurred. Since there has 
been a secular rise in relevant price levels, proportional growth by 
merger has been understated in this respect. Moreover, the understate- 
ment is probably of considerable magnitude; for, in spite of the under- 
statement, Weston’s data on all firms as a group show that a large share 
of growth by merger occurred in early years: 8 per cent before 1911, 
21 per cent before 1921, 71 per cent before 1931, and 89 per cent before 
1941 [7, pp. 155-56]. Unfortunately, the deflation problem is formidable.⁸ 
There is simply no easy way to determine the degree of understatement, 





6 Weston suggests other reasons for mergers, applying primarily to recent ones [7, pp. 70-75]. 
But they seem to boil down to a list of ways in which mergers have an advantage over internal expansion. 

7 This is reduced in the second of his measures by the value of initial assets. 

8 This problem might have been partly avoided, or at least some notion of the likely bias might 
have been gotten, by measuring growth in terms of employment. The suggestion is made, however, 
with no knowledge of the empirical difficulties involved. 
particularly on the basis of the data presented. The best we 
can do is to keep this qualification in mind when examining Weston’s 
findings. 
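
The direction of the bias can be seen in a crude sketch. Everything in it is hypothetical: the price levels, the acquisitions, and the function name `merger_share` are assumptions made for illustration, and no account is taken of depreciation or retirement of acquired assets, which is part of what makes the real problem formidable. The sketch merely restates each acquisition at final-year prices and compares the resulting share with the unadjusted one.

```python
def merger_share(acquisitions, total_assets_final, price_index, final_year,
                 restate_at_final_prices):
    """Share of final-year assets attributable to mergers.

    acquisitions -- list of (year, value at the prices of that year)
    price_index  -- dict mapping year to a price level (hypothetical)
    restate_at_final_prices -- if True, revalue each acquisition at
                               final-year prices before summing
    """
    external = 0.0
    for year, value in acquisitions:
        if restate_at_final_prices:
            value = value * price_index[final_year] / price_index[year]
        external += value
    return external / total_assets_final


# Hypothetical price levels and acquisitions; none of these figures come from Weston.
prices = {1905: 50.0, 1925: 80.0, 1948: 100.0}
deals = [(1905, 10.0), (1925, 16.0)]

unadjusted = merger_share(deals, 100.0, prices, 1948, restate_at_final_prices=False)
adjusted = merger_share(deals, 100.0, prices, 1948, restate_at_final_prices=True)
print(f"{unadjusted:.2f} {adjusted:.2f}")  # prints: 0.26 0.40
```

With rising prices the adjusted share must exceed the unadjusted one, and the gap widens the earlier the mergers occurred.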

For the firms as a group, external growth accounted for 33, 22, and 
19 per cent of total growth, under the three measures used. The highest 
fraction occurs of course when initial assets are counted as external 
growth. For each measure the fraction is significantly higher for firms 
with “questionable” data than for other firms.⁹ It is difficult to know 
how to interpret this, however, since the direction of error in the former 
is unknown. There appears to be no general relation between propor- 
tional growth by merger and size of firm; unweighted and weighted 
means are close together. At the same time, there is a wide dispersion 
among firms. For instance, when initial assets are counted as growth by 
merger, proportional growth by merger ranges from less than 3 per cent 
(Reynolds Tobacco) to almost 85 per cent (B. F. Goodrich). The me- 
dian is about 30 per cent. 

The same dispersion is apparent among industries. When the 74 
firms are squeezed into 22 industrial categories, proportional growth by 
merger, expressed as the weighted mean for firms in each industry, 
ranges from 8 per cent for aluminum to almost 70 per cent for cement, 
ammunition, and steel.¹⁰ The median is about 36 per cent. The data by 
industry are summarized in the table below. It might be noted that 
Weston does not give the weighted means in his tabular presentation 
[7, p. 22], and in some cases these differ rather markedly from un- 
weighted means (for instance, asphalted-felt floor coverings, photo- 
graphic apparatus, and dairy products). He also uses a figure for steel 
derived from growth of ingot steel capacity, a figure that is much lower 
than the one derived from growth of total assets. 
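
Why the two means can diverge is easily shown. The sketch below follows the definitions used in the table (the unweighted mean averages the firms' own percentages; the weighted mean divides aggregate growth by merger by aggregate total growth), but the two-firm industry and the function names are hypothetical and serve only as an illustration.

```python
def unweighted_mean(firms):
    """Arithmetic mean of each firm's own merger percentage."""
    return sum(100.0 * ext / tot for ext, tot in firms) / len(firms)


def weighted_mean(firms):
    """Aggregate growth by merger as a percentage of aggregate total growth."""
    return 100.0 * sum(ext for ext, _ in firms) / sum(tot for _, tot in firms)


# Hypothetical industry of two firms, each given as (growth by merger, total growth).
industry = [(8.0, 10.0), (20.0, 100.0)]
print(round(unweighted_mean(industry), 1), round(weighted_mean(industry), 1))
# prints: 50.0 25.5
```

A small firm built largely by merger pulls the unweighted mean up sharply while scarcely moving the weighted one; the reverse holds when the largest firm is the one that grew chiefly by merger.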

It must be repeated that all figures are reduced if Weston’s other two 
measures are used, some quite drastically. For instance, if initial assets 
are excluded from consideration, ammunition tumbles from the top of 
the list of industries (70 per cent) down to the bottom (2 per cent)." 

What are we to conclude from this evidence? Let us see first what 
Weston concludes. “It appears,” he says immediately following the 
presentation of evidence, “that as a group, and irrespective of measure- 





* Weston considers the differences as “not great.” Whether great or not, they must be taken as 
significant, running as follows: 41 per cent as compared with 36 per cent (for firms with “dependable” 
data) and 26 per cent (for firms with “accurate” data); 32 as compared with 23 and 18; and 27 as com- 
pared with 19 and 16. 

1° The figures for cement and ammunition are probably not very meaningful. Only two cement 
companies are covered, one (Lone Star) accounting for 7 per cent of output in 1945, the other (Ideal) 
for 3 per cent [7, p. 40]. Only one ammunition company (Remington Arms) is covered. 

1 But see n. 10 above. 
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ment assumptions, the firms studied achieved the major extent of their 
growth through internal development. The proportion of growth 
through external acquisitions, however, is appreciable” [7, p. 15]. This 
at one point; but later he says something quite different. In his sum- 


PROPORTIONAL GROWTH BY MERGER IN 22 INDUSTRIES* 








Unweighted Weighted —_— 
Industry Mean? Mean® (%) 
(%) (%) “ 





Ammunition? 70 70 
Cement* 69 69 
Steel 64! 67 
Compressed and liquefied gases 52 52 
Asphalted-felt floor coverings 51 42 
Typewriters and parts 51 54 
Photographic apparatus® 49 28 
Dairy products 48 62 
Corn syrup, sugar, oil 44 49 
Tin cans® 41 39 
Liquors 40 39 
Rubber tires 38 33 
Meat packing 30 33 
Petroleum refining 28 27 
Cigarettes 25 21 
Inke 22 23 
Rayon and allied products 22 31 
Electrical machinery® 21 20 
Motor vehicles 21 18 
Sewing machines? 17 ir 
Agricultural implements 14 19 
Aluminum 10 8 





* Initial assets counted as growth by merger. 

> Arithmetic mean of percentages for firms in each industry. Source: [7, p. 22]. 

° For all firms in each industry as a group, aggregate growth by merger as a percentage of total 
growth. Source: [7, appendix E]. 

4 Only one firm covered. 

© Only two firms covered. 

f Weston gives 53 per cent, a figure based on growth of ingot steel capacity expressed in tons. 


ming up at the end of the chapter on absolute growth he states: “Ac- 
quisitions have been a negligible portion of the total growth of most of 
the firms in census industries now characterized by a high degree of 
concentration in output... . The direct effect of mergers on the absolute 
size of large firms appears to have been small” [7, p. 30]. And in his 
final summing up: “The extent to which individual firms have grown 
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by acquisitions varies greatly, but external growth is a relatively minor 
fraction of the total growth of most of the firms” [7, p. 101]. There is a 
rather wide gulf between “appreciable,” on the one hand, and “negligi- 
ble,” “small,” and “relatively minor,” on the other. There is little doubt 
from the general tone of his argument and widely scattered references 
that Weston really considers mergers a negligible source of growth. Yet 
his temporary wavering in the other direction is significant; in fact, it is 
the key for understanding why Weston adopts his final conclusion. 

The point is simply this: As was said earlier, the question Weston 
sets out to examine is whether mergers have been “the major source of 
economic concentration.” If “the major source” is understood to mean 
“responsible for substantially more than half,” fractions below a half 
become minor, or small or negligible. It is now clear that Weston does 
conceive of the issue in these terms. Otherwise, why his conclusions? 
It is difficult to see any other grounds for calling a fraction of 33 per 
cent (or 22 or 19 per cent) negligible. 

The impressive thing, to me at least, is the height of the fractions, not
their depth—particularly when account is taken of their probable 
downward bias and of the vast growth of the economy over the last half 
century, which leads us to expect internal growth to swamp growth by 
merger. Growth by merger has certainly been important enough, in a 
sizable group of the industries studied, to cast serious doubt on some of 
the explanations advanced for industrial concentration and abnormally 
large size. 

Whatever conclusions are drawn, they should be considered tenta- 
tive. Although this study makes a useful contribution, much remains 
to be done. It would be useful, for instance, to know the role played by 
mergers in industries that were highly concentrated around the turn of 
the century but are no longer; and in industries with continually low 
concentration. It would also be useful to study the changing importance 
of mergers over time. Weston unfortunately provides little information 
on this matter; the data presented by him are limited to broad sweeps 
of time ending in 1948, not broken down into subintervals. Finally, 
learning about the role of mergers in absolute growth only starts us on 
the way to learning about the role in relative growth. This leads us into 
the next topic discussed by Weston. 


RELATIVE GROWTH OF FIRMS 


Weston opens the discussion of relative growth by characterizing the 
trend in industrial concentration as a movement from partial monopoly 
around the turn of the century to oligopoly in recent times. The period 
of partial monopoly (dominance of an industry by a single firm) is in
turn described as a temporary diversion from a historically typical pat- 
tern of oligopoly (multiple dominance), as far as concentrated indus- 
tries are concerned. That diversion he attributes almost entirely to the 
merger movement around the turn of the century. That is, growth by 
merger is in his view the primary explanation for increased concentra- 
tion in specific industries in the 1890’s and early 1900’s. Later develop- 
ments, he asserts, are a different matter: “acquisitions subsequent to 
the early merger movement have had relatively small effects on con- 
centration” [7, p. 48]. 

Before we proceed to the core of his argument, a digressive comment 
on the early merger movement should be made. Weston’s description 
of the result as a general trend from multiple to single dominance of 
industries implies that market areas remained essentially fixed during 
the latter part of the nineteenth century. This was, however, a period in 
which the truly national market was emerging; localized markets were 
much more prevalent in the years leading up to the mergers than there- 
after. It is therefore not unreasonable to view mergers as a force coun- 
teracting expansion of markets, and hence as leading to less change in 
the structure of dominance than is usually believed. This point is raised 
not to refute Weston’s argument, but rather to shift the line of argu- 
ment in a rather obvious way. The point is this: The early merger 
movement is important in the history of industrial concentration be- 
cause it made concentration, taken in a relevant sense, greater than it 
otherwise would have been, irrespective of whether it actually in- 
creased concentration or not. 

Now, when we come to developments after 1904, the primary issue 
must be put in a similar way: Have mergers played an important role 
in making the pattern of concentration significantly different from what 
it otherwise would have been? Weston raises this question, but only 
after he is far along in his discussion; and then it is put as more or less 
subordinate to two other, narrower, issues. First, he wants to know 
whether the trend from single to multiple dominance can be attributed 
to a change in the nature and motivation of mergers. This question is 
essentially part of a running argument with Professor Stigler, on which 
comment will be deferred until later. Secondly, he wants to know 
whether mergers since 1904 have been accompanied by increased con- 
centration. He concludes that they have not. It is in qualification of 
this conclusion that he raises the central issue, namely, “decreases in
concentration might have been even larger in the absence of mergers”
[7, p. 44].

In order to understand Weston’s analysis of the central issue, we are
almost forced to follow his rather wandering path to it. As he sets out 
to trace the relation between mergers and trends in industrial concentration,
he is immediately confronted with the frustrating lack of reliable
historical data on industrial concentration. Everyone who has
struggled with this problem will feel sympathy for Weston. Concentration
ratios can be compiled from the Census of Manufactures from
1929 to date; many already have been. However, ratios for individual 
firms are not available even here, since the data are grouped for no 
fewer than the four largest firms in each industry. Moreover, ratios for 
different years are not strictly comparable because of different defini- 
tions of industries. A fairly large fund of information can be gathered 
for the turn of the century, much of it broken down by individual 
firms;¹² but its reliability is doubtful, to say the least. The investigator
is left to his own devices in constructing time series; he must search 
trade journals, isolated monographs, and so on. The difficulties can 
scarcely be exaggerated. 

Weston pulls together estimates of each leading firm’s share of out- 
put in 9 of his original 22 industries, covering selected years over the 
last half century. The estimates for 5 industries (motor vehicles, steel, 
cigarettes, aluminum, and cement) are as reliable as one can expect, 
though they could be more complete. On the other hand, the estimates 
for the remaining 4 industries (electrical machinery, meat packing, 
rubber tires, and tin cans) are built on a very shaky foundation, and it
is doubtful that much significance can be attached to them. In these latter
cases, each firm’s share of output is taken to be the same as the ratio
of its sales of all products to the value of products for the census 
industry in which that firm can be classified. The size of probable 
errors under this procedure is so large as to destroy the significance of 
all but very large differences in shares at different dates. It must be 
granted that census value of products for an industry, when computed 
on an establishment basis, includes the value of some products not 
classified in the industry; but it would be highly unlikely that errors 
here and in sales data for firms would be compensating, among firms 
and between firm and industry. In a footnote to his table¹³ Weston recognizes
that the data “are not strictly comparable,” but he believes





12 See, e.g., [2, pp. 129-40]. 

13 Some indication of possible errors is provided by comparing the combined share of the four
leading firms for around 1935 as derived from Weston’s estimates, in some cases by interpolation, with
the combined share for 1935 computed from census data. For meat packing, the two are 30 and 50 per
cent, respectively; for rubber tires, 73 and 81 per cent; for tin cans, 87 (two firms) and 80 (four firms)
per cent; and for electrical machinery, 37 (two firms) and 44 (four firms) per cent [7, pp. 40-41 and
116].
that “the percentages through time indicate very roughly trends in
occupancy of the market” [7, p. 40]. Even this limited conclusion is 
open to serious doubt. Moreover, in his analysis he treats changes in 
shares of output, measured in this way, as having much more accuracy 
than would be possessed by rough indicators of trends. 
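
The comparison reported in the footnote on possible errors can be laid out as a small check. The figures are those quoted there for around 1935; the function name and the rule applied (a combined share for two or four leading firms cannot exceed the census share of the four largest firms in the same industry and year) are an illustrative framing rather than anything Weston states.

```python
# (industry, share derived from Weston's estimates, firms covered,
#  census share of the four largest firms), all in per cent, around 1935.
comparisons = [
    ("meat packing", 30, 4, 50),
    ("rubber tires", 73, 4, 81),
    ("tin cans", 87, 2, 80),
    ("electrical machinery", 37, 2, 44),
]


def compare(estimate, census_share):
    """Return the gap in percentage points and whether the estimate exceeds the census share."""
    gap = estimate - census_share
    exceeds_census = gap > 0
    return gap, exceeds_census


for industry, estimate, firms_covered, census_share in comparisons:
    gap, exceeds = compare(estimate, census_share)
    print(f"{industry} ({firms_covered} firms): gap {gap:+d} points, "
          f"exceeds census four-firm share: {exceeds}")
```

On these figures the tin can estimate for two firms exceeds the census share for four, which cannot happen if both numbers are accurate; the meat packing estimate falls 20 points short of the census figure for the same four firms.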

Weston’s first step is to examine developments in each of these 9 
industries. Let us focus on those for which data on concentration can 
be considered reliable. In the case of motor vehicles, he notes a rather 
steady secular rise, with temporary ups and downs, in the dominance 
of General Motors, being achieved since 1921 largely at the expense of 
Ford’s share of the market. The gains of General Motors are, he says, 
not to be attributed to mergers because “the emergence of General 
Motors Corp. as a leader in the industry came many years after the 
consolidating operations from 1911 to 1920 under W. C. Durant” [7, 
p. 36]. This is a strange conclusion in several respects. First, Durant 
was in and out of control over General Motors in the period from 1908 
through 1920; from 1910 to 1915, while out, he built up the Chevrolet 
Motor Company into a threatening rival [6, pp. 419-429]. After the 
two were merged in 1915, through financial manipulations by Durant, 
General Motors’ share of the market rose substantially [6, p. 27]. Sec- 
ond, the spectacular rise of General Motors occurred in the immedi- 
ately following decade of the twenties, after the serious financial prob- 
lems inherited from Durant’s regime had been solved [6, p. 27]. This 
development cannot be read from the information Weston presents, 
because he does not give data for the period between 1921 and 1937. 
Third, merger activity was by no means stopped after 1920 [6, pp. 
428-29]. 

Another important development hidden by Weston’s presentation of 
data is the rapid rise of Chrysler in the late twenties and throughout 
the thirties, a rise attributable in large measure to mergers. Chrysler’s 
share of the market rose steadily (except for around 1935) from 3 per 
cent in 1925 to 23 per cent in 1937 [6, p. 27]. 

For the steel industry he notes a steady decline in the combined 
share of the four leading producers from 1901 through 1920, an in- 
crease through 1930, and a slight decline thereafter. The rise in the 
twenties he attributes to growth by merger. He says that “since 1930, 
however, despite continued merger activities, market occupancy of the 
largest four has decreased slightly” [7, p. 38]. This statement is misleading
for two reasons. First, the decline in the combined share of the
four (or five) largest firms is barely perceptible, running at 0.2 of a percentage
point, well within the range of computational error alone. Second,
the shares of Republic and Bethlehem, taken individually, increased
by a significant amount; and they were the firms with greatest
proportional growth by merger over this period. Their combined increase
was matched by the decline for U.S. Steel, which had virtually
no merger activity in this period.

He describes the trend in the cigarette industry as similar to that 
in steel. In this case, however, the pattern seems to be one of a general 
secular decline in the share of each of the four leaders through 1939, 
with the rise of Philip Morris in the thirties complicating the picture. 
If the data are carried through 1949 (which is not done by Weston), the 
pattern is changed to the extent that the largest producer, American 
Tobacco, regains most of the share it lost between 1912 and 1939 [5, 
p. 94]. 

The two remaining industries with reliable data on concentration are 
aluminum and cement. Weston rightly describes aluminum as a special 
case of decreasing concentration resulting from disposal of wartime- 
created capacity. We are given few data on concentration in the cement 
industry, covering only the span from 1929 through 1945. We are given 
even fewer data on absolute growth by merger.¹⁴ Hence no conclusions
can be drawn about this industry, and Weston does not draw any. 

According to Weston, the data for these industries, and the other 
four not discussed here, “suggest that concentration in industries has 
not generally been increased by mergers since 1904. In the majority of 
industries for which information is available, decreases in concentration 
have actually occurred since 1904 despite merger activity” [7, p. 42]. 
In a sense Weston is certainly right: the share of the largest firm in 
most industries has declined, and the combined share of the two, three, 
or four largest firms has also. But this is not really relevant; the im- 
portant question is whether the declines occurred for firms with high 
or low proportional growth by merger. The following table shows a 
rather consistent relation between declines and relatively low propor- 
tional growth by merger, and between increases and relatively high pro- 
portional growth by merger. This conclusion is not vitiated if industries 
with questionable data are included. For every industry except rubber 
tires and possibly meat packing, relatively low proportional growth by 
merger is associated with declines in share of the market. For every 
industry except possibly meat packing and cigarettes (one firm only), 
relatively high proportional growth by merger is associated with in- 
creases in share of the market. 

RELATION OF MERGERS TO CHANGES IN INDUSTRIAL
CONCENTRATION, 1904-1948

                          Proportional
                          Growth by       Change in     Share of
                          Mergerᵃ         Share of      Output, 1948ᵇ
Industry and Firm         (%)             Output        (%)

Motor vehicles
  Chrysler
  General Motors
  Ford
Steel
  Republic
  Jones and Laughlin
  Bethlehem
  U.S. Steel
Cigarettesᶜ
  American Tobacco
  Philip Morris
  Lorillard
  Reynolds
  Liggett and Myers
Aluminum
  Reynolds
  Permanente
  Alcoa
Electrical machinery
  Westinghouse                            +?
  General Electric                        -?
Meat packingᵈ
  Armour                                  0?
  Wilson                                  0?
  Swift                                   -?
  Cudahy                                  0?
Rubber tiresᵉ
  Goodrich                                +?
  U.S. Rubber                             0?
  Goodyear                                +?
  Firestone                               +?            24?
Tin cansᵈ
  Continental Can                         +?            25?
  American Can            4 (34)          -?            51?

ᵃ Initial assets counted as growth by merger, except for firms engaging in important mergers
before 1904 [7, p. 150] and firms in the cigarette industry formed by dissolution of the Tobacco Trust in
1911. For the latter two groups of firms, percentages including initial assets as growth by merger are
shown in parentheses. Source: [7, appendix E].

ᵇ Source: [7, pp. 39-41]; for the cigarette industry, also [5, p. 94].

ᶜ Terminal date is 1949.

ᵈ Terminal date is 1939.

ᵉ Terminal date is 1947.


14 See n. 10 above.

Weston’s conclusion, namely, that concentration has decreased
since 1904 despite mergers, leads him to a second line of inquiry. He
raises the possibility that “decreases in concentration might have been 
even larger in the absence of mergers” [7, p. 44]. He embarks at this 
point on a statistical analysis that I am not sure I fully understand. 
My explanation must be given with the warning that it may not be an 
accurate representation of what Weston is trying to do. 

We may get at his approach by considering how information on pro- 
portional growth by merger might be married to information on output 
concentration. Let us suppose that, in the absence of mergers, the 
growth of a firm would have been smaller by exactly the amount of 
assets acquired by mergers. Let us further suppose that a firm’s output 
grows in the same percentage as its assets; that is, a doubling of assets 
leads to a doubling of output. Finally, let us suppose that the firms 
acquired by merger had retained their separate identities and had not 
grown. These are heroic assumptions, subject to all kinds of qualifica- 
tion; but they can perhaps serve as working hypotheses for deriving a 
first approximation, which will almost surely be an underestimate, of 
the amount of concentration attributable to mergers. If they are ac- 
cepted on this basis, it follows that the fraction of a firm’s share of 
output attributable to mergers is measured by the fraction of its 
growth directly accounted for by mergers. For instance, by this reasoning
77 per cent of Republic’s share of steel output would be attributable
to mergers, since 77 per cent of its growth is directly accounted for by 
mergers; instead of producing 9 per cent of steel output in 1948, it 
would have produced only 2 per cent if none of the mergers had taken 
place. 
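
The arithmetic behind the Republic example can be written out as a short sketch. It takes the three working hypotheses above at face value and treats the firm's share as shrinking in proportion to the growth removed; the function name and its parameters are illustrative assumptions, and only the two Republic figures come from the text.

```python
def share_without_mergers(actual_share, merger_fraction):
    """Share of output a firm would hold with the merger-derived part of its growth removed.

    actual_share: the firm's observed share of industry output, in per cent.
    merger_fraction: fraction of the firm's growth directly accounted for
        by mergers, between 0 and 1.
    """
    if not 0.0 <= merger_fraction <= 1.0:
        raise ValueError("merger_fraction must lie between 0 and 1")
    return actual_share * (1.0 - merger_fraction)


# Republic: 9 per cent of steel output in 1948, 77 per cent of growth by merger.
republic = share_without_mergers(9.0, 0.77)
print(round(republic, 2))  # 2.07, i.e. roughly 2 per cent
```

The same function applies to any firm for which both a share of output and a proportional growth by merger are available, subject to the same heroic assumptions.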

In principle the effect of mergers over any desired period could be 
estimated in this way, by measuring proportional growth by merger 
over that period alone. We cannot do this with the statistics worked up 
by Weston, however; for his measures of proportional growth by 
merger differ only in their treatments of assets in the initial years for 
which he found data, years that vary widely among firms [7, p. 11]. By 
extensive reworking of his basic data we might be able to eliminate all 
mergers before some particular date, but this would be a major job. 
Except for that possibility, the only thing that can be done is to esti- 
mate the effect of mergers over the entire life of firms. To do this one 
needs to look no further than the measures of proportional growth by 
merger, with initial assets counted as growth by merger; that is, all one 
needs is the evidence Weston has developed on the importance of 
mergers in absolute growth. Since mergers have generally accounted 
for substantial fractions of the absolute growth of the firms studied, 
it follows that they also account for substantial fractions of the firms’ 
shares of output. Hence they have caused the pattern of concentration
to be significantly different from what it otherwise would have been. 

Weston looks at the problem somewhat differently. He tries to relate 
quantitative changes in concentration to growth by merger, each occurring
over restricted time periods; in addition, he tries to measure the
fraction of a firm’s share of output around 1948 that is attributable to 
mergers occurring over a restricted time period. It is hard to see what 
his findings can be taken to mean. First, the period of mergers will only 
rarely coincide with the period of changes in concentration, since the 
initial date in each case is simply the earliest year for which pertinent 
data could be found, in the one case on assets, in the other on share of 
output. Second, periods will vary widely among firms. Third, the 
estimates of shares of output do not have the accuracy required by the 
analysis. Finally, his sample has dwindled to 25 firms, whose identities 
are not revealed. 

There would seem to be little reason for reviewing his findings, but 
it may be appropriate to say a few more words about his general 
method. The assumptions underlying his approach seem to be those 
outlined above, though he does not state them explicitly, and they are 
not easy to unravel from his explanation of procedures. Perhaps it is 
best to let the reader decide by having Weston speak for himself: 

Of the several possible techniques for measuring the influence of internal 
expansion on existing levels of concentration, the following appeared to be 
most useful. First, the market occupancy percentage of the leading firms in 
an industry for the earliest year for which data are available was calculated. 
Second, data on external growth were deducted from absolute amounts of 
output or total assets of individual firms, but not for the industry as a whole. 
Third, the adjusted data of output or total assets for the firms were used to 
calculate market occupancy percentages of individual firms which would
have obtained if the growth of the firms had occurred entirely by internal 
expansion. Fourth, the adjusted concentration ratios were compared with 
the concentration ratios of the earliest period to measure the extent of 
present concentration due to internal growth . . . [7, p. 44]. 

... concentration ratios which would have existed in the absence of ex- 
ternal growth subsequent to the initial year for which data could be secured 
for individual firms . . . are [next] deducted from concentration ratios exist- 
ing in 1947, to provide measures of the extent to which present concentration 
is accounted for by acquisitions which took place after the early merger 
movement. [7, p. 46.] 


This is the entire explanation. There is no further clarification of 
how time periods vary; how proportional growth by merger is meas- 
ured; how assets acquired by merger were “deducted from absolute 
amounts of output”; how data on concentration of assets were compiled,
and in what cases they were used; what firms and industries are
covered in the sample; and so on. 
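
One possible reading of the quoted procedure can be sketched as follows. It is only a guess at the arithmetic, since the passage does not say how the deduction was made: external growth is subtracted from the firm's output but not from the industry total, and the adjusted occupancy is then set against the actual one. The function names and all figures are hypothetical.

```python
def occupancy(firm_output, industry_output):
    """Market occupancy of a firm, in per cent of industry output."""
    return 100.0 * firm_output / industry_output


def adjusted_occupancy(firm_output, external_growth, industry_output):
    """Occupancy after deducting external growth from the firm only,
    leaving the industry total unchanged (as the quoted second step says)."""
    return occupancy(firm_output - external_growth, industry_output)


def points_due_to_acquisitions(firm_output, external_growth, industry_output):
    """Actual occupancy minus adjusted occupancy, in percentage points."""
    actual = occupancy(firm_output, industry_output)
    adjusted = adjusted_occupancy(firm_output, external_growth, industry_output)
    return actual - adjusted


# Hypothetical figures in arbitrary units of output.
firm_output, external_growth, industry_output = 200.0, 50.0, 1000.0
print(adjusted_occupancy(firm_output, external_growth, industry_output))          # 15.0
print(points_due_to_acquisitions(firm_output, external_growth, industry_output))  # 5.0
```

Even in this simple form the sketch shows what the explanation leaves open: how assets acquired by merger are converted into units of output, and over which period the deduction runs.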

One must regretfully conclude that Weston’s discussion of the role 
played by mergers in the development of recent patterns of industrial 
concentration adds little to the knowledge already implicit in his evi- 
dence on the importance of mergers in absolute growth. Some new 
light is shed on developments in three industries, but these make up 
only a tiny sample of all industries, and much was known about them 
before. He does not provide the evidence needed to back up his con- 
tention that mergers have had very little to do with changes in in- 
dustrial concentration since 1904. 


WESTON versus STIGLER 


An interesting sidelight to Weston’s book is a running argument 
with Stigler over the relation between mergers and industrial concentration.¹⁵
It deserves mention because of its strong influence on the
course of Weston’s enquiry. 

The primary source of dispute is a paper by Stigler [4]. Weston sees 
in this paper several controversial theses, which he puts as follows: 
(a) “mergers have been the major factor causing the development of 
bigness and concentration in the American economy” [7, p. 7]; (b)
mergers around the turn of the century were motivated solely by a 
desire to achieve monopoly [7, p. 32]; (c) “mergers have been the main 
instruments by which partial monopolies have been transformed into 
oligopolies” [7, p. 36]; and (d) survival is the “test of relative efficiency 
among firms of different sizes” [7, pp. 64-65]. 

Now exegesis is a tricky business, and often a fruitless one; there is 
little to be gained here by prolonged laboring over what Stigler “really” 
said. Allow me, however, to offer some brief interpretations of Stigler’s 
views on these points, and to add a few comments on some of Weston’s 
replies. For the rest, let Stigler’s paper speak for itself. 

Stigler makes a remark in his introductory comments that is akin to 
the first thesis attributed to him. It is not repeated at any other point, 
in either the same or related form. Moreover, close examination reveals 
that the kinship is remote. This is what he says: “There are no large 
American companies that have not grown somewhat by merger, and 





15 There is another argument as well, which will not be discussed here. It deals with the importance
of taking account of the international scope of markets in calculating concentration ratios. Stigler’s 
general position is that failure to do so makes ratios significantly higher than they should be [3, p. 7; 
also apparently in a letter to Weston]. Weston recomputes ratios for 1935 to include imports in total 
output and concludes that the ratios generally are not significantly reduced. It is difficult to comment on 
this dispute because part of it stems from an unpublished letter sent by Stigler to Weston, whose con- 
tents are alluded to by Weston. 
probably very few that have grown much by the alternative method
of internal expansion”; to which is added the footnote: “Unless other- 
wise indicated, size of the firm is to be measured relative to the size 
of the industry” [4, p. 23]. I take this to mean that dominant firms 
would not generally have gotten that way in the absence of mergers. 
Weston’s interpretation, which need not be repeated here, is quite 
different. 

Hard searching has not produced for me the second and third theses. 
The closest Stigler comes to them is the following statement, which is 
far away indeed: “We shall find it useful to divide this history [of 
mergers] into two periods, in which monopoly and oligopoly, respec- 
tively, were the primary goals” [4, p. 27]. 

The last thesis is really there. His full statement is as follows: 

The comparative private costs of firms of various sizes can be measured 
in only one way: by ascertaining whether firms of the various sizes are able 
to survive in the industry. Survival is the only test of a firm’s ability to cope 
with all the problems: buying inputs, soothing laborers, finding customers, 
introducing new products and techniques, coping with fluctuations, evading 
regulations, etc. A cross-sectional study of the costs of inputs per unit of 
output in a given period measures only one facet of the firm’s efficiency and 
yields no conclusion on efficiency in the large. Conversely, if a firm of a given 
size survives, we may infer that its costs are equal to those of other sizes of 
firm, being neither less (or firms of this size would grow in number relative
to the industry) nor more (or firms of this size would decline in number
relative to the industry) [4, p. 26].
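
The inference described in the last sentence of the quotation can be sketched in a few lines. This is an illustration of the logic only, not a procedure given by Stigler: the size classes, the two dates, and the figures are all hypothetical, and each class is judged by the change in its share of the number of firms in the industry.

```python
def survivor_inference(shares_then, shares_now):
    """Infer relative costs of each size class from the change in its share of firms.

    shares_then, shares_now: dicts mapping size class -> per cent of the
    industry's firms falling in that class at the earlier and later dates.
    """
    inference = {}
    for size_class, before in shares_then.items():
        change = shares_now[size_class] - before
        if change > 0:
            inference[size_class] = "lower costs"   # class growing in number
        elif change < 0:
            inference[size_class] = "higher costs"  # class declining in number
        else:
            inference[size_class] = "equal costs"   # class holding its own
    return inference


# Hypothetical distribution of firms by size class at two dates (per cent).
then = {"small": 50, "medium": 30, "large": 20}
now = {"small": 40, "medium": 30, "large": 30}
print(survivor_inference(then, now))
```

The objections that follow bear on whether firms grouped into one industry in such a tabulation really produce the same product for the same markets.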


Weston examines this statement in his chapter on the theory of 
mergers, which we have not discussed. He disagrees on four grounds. 
First, firms classified in the same industry, as defined for instance by 
the Census, do not all produce the same products; in particular, large 
ones typically produce many products, while small ones typically 
specialize in a single product, frequently an accessory or a custom 
item. This is certainly true, and it is a proper warning against in- 
cautious use of data. But it does not contradict Stigler’s proposition. 
It might be added parenthetically that, even by taking full advantage 
of such empirical ambiguities, one is hard-pressed to name more than 
a handful of large industries in which there are not firms of widely 
varying size producing essentially the same product for essentially the 
same markets. Those who are sceptical should try. 

Second, small firms may be kept alive because dominant firms 
spread a price “umbrella” over them; that is, the difference between 
cost and price for the dominant firm is sufficiently large to allow sur- 
vival of less efficient small firms. This point may be valid only if smaller 
firms account for a very small portion of output. Larger firms, if more
efficient, should gradually displace smaller ones, the point made by 
Stigler. If they do not, the implication is that the cost advantage to 
the dominant firms, if any, must be unique (in the form of an economic 
rent), not available to large firms in general. 

Third, smaller firms might be satisfied with lower rates of return on 
investment. This is possible; but so might larger firms. In any event, 
does this make any difference in the general results? The fourth point 
is essentially a repetition of this one, suggesting in addition that smaller 
firms might be satisfied to have their assets undervalued. 

The most interesting thing about this controversy is the way in 
which Weston has misread Stigler, particularly on the first three points. 
In the paper under dispute, Stigler is really trying to find out, in part 
by recourse to history, why mergers have been so often used in prefer- 
ence to internal expansion as a means of achieving dominance in an 
industry; and why they have, since about 1904, contributed more 
toward oligopoly than toward monopoly. In answering these questions 
Stigler is led to a general theoretical explanation for industrial con- 
centration, deflating the importance of economies of scale and inflating 
the importance of temporary exploitation. Here are revealed some of 
the basic issues that motivate us to seek more data on mergers. If the 
data are to be relevant, the empirical questions must also be relevant. 
Somehow, by bare margins in some cases, Weston has failed to raise the 
relevant questions. 







CONCLUDING REMARKS 


This review, like most, has stressed a book’s vices and slighted its 
virtues. A few words are called for to help redress the imbalance. 

The book has resulted from a major research undertaking. It offers 
much information not before available, and specialists in the field of 
industrial organization will surely want to exploit it fully. They will 
also want to make use of the likely rich source of data represented by 
Weston’s worksheets, obtainable from the Bureau of Business and 
Economic Research, University of California. 

At the same time, the reader must beware of blind acceptance of 
Weston’s conclusions and some of his analysis. The facts as I see them 
do not support much of what Weston has to say, particularly about 
the influence of mergers on industrial concentration since 1904. In 
cases where they do, the conclusions sometimes have a significance 
different from what Weston supposes. 

In brief, this book breaks the ground well in an area where little 
comprehensive statistical work had been previously done. It is not the
final word, but it is a welcome beginning. 
SPURIOUS CORRELATION: A CAUSAL INTERPRETATION*


HERBERT A. SIMON
Carnegie Institute of Technology 


To test whether a correlation between two variables is 
genuine or spurious, additional variables and equations must 
be introduced, and sufficient assumptions must be made to 
identify the parameters of this wider system. If the two origi- 
nal variables are causally related in the wider system, the 
correlation is “genuine.” 


EVEN in the first course in statistics, the slogan “Correlation is no
proof of causation!” is imprinted firmly in the mind of the aspiring
statistician or social scientist. It is possible that he leaves the course 
(and many subsequent courses) with no very clear ideas as to what is 
proved by correlation, but he never ceases to be on guard against 
“spurious” correlation, that master of imposture who is always repre- 
senting himself as “true” correlation. 

The very distinction between “true” and “spurious” correlation ap- 
pears to imply that while correlation in general may be no proof of 
causation, “true” correlation does constitute such proof. If this is what 
is intended by the adjective “true,” are there any operational means 
for distinguishing between true correlations, which do imply causation, 
and spurious correlations, which do not? 

A generation or more ago, the concept of spurious correlation was 
examined by a number of statisticians, and in particular by G. U. 
Yule [8]. More recently important contributions to our understanding 
of the phenomenon have been made by Hans Zeisel [9] and by Patricia 
L. Kendall and Paul F. Lazarsfeld [1]. Essentially, all these treatments 
deal with the three variable case—the clarification of the relation be- 
tween two variables by the introduction of a third. Generalizations 
to n variables are indicated but not examined in detail. 

Meanwhile, the main stream of statistical research has been diverted
into somewhat different (but closely related) directions by Frisch’s
work on confluence analysis and the subsequent exploration of the
“identification problem” and of “structural relations” at the hands of
Haavelmo, Hurwicz, Koopmans, Marschak, and many others.¹ This
work has been carried on at a level of great generality. It has now
reached a point where it can be used to illuminate the concept of
spurious correlation in the three-variable case. The bridge from the
identification problem to the problem of spurious correlation is built
by constructing a precise and operationally meaningful definition of
causality, or, more specifically, of causal ordering among variables in
a model.²

* I am indebted to Richard M. Cyert, Paul F. Lazarsfeld, Roy Radner, and T. C. Koopmans for
valuable comments on earlier drafts of this paper.

1 See Koopmans [2] for a survey and references to the literature.


1. STATEMENT OF THE PROBLEM


How do we ordinarily make causal inferences from data on correlations?
We begin with a set of observations of a pair of variables, x and
y. We compute the coefficient of correlation, $r_{xy}$, between the variables,
and whenever this coefficient is significantly different from zero we wish
to know what we can conclude as to the causal relation between the
two variables. If we are suspicious that the observed correlation may
derive from “spurious” causes, we introduce a third variable, z, that,
we conjecture, may account for this observed correlation. We next
compute the partial correlation, $r_{xy \cdot z}$, between x and y with z “held
constant,” and compare this with the zero-order correlation, $r_{xy}$. If
$r_{xy \cdot z}$ is close to zero, while $r_{xy}$ is not, we conclude that either: (a) z is an
intervening variable, so that the causal effect of x on y (or vice versa) operates
through z; or (b) the correlation between x and y results from the joint
causal effect of z on both those variables, and hence this correlation is
spurious. It will be noted that in case (a) we do not know whether the
causal arrow should run from x to y or from y to x (via z in both cases);
and in any event, the correlations do not tell us whether we have
case (a) or case (b).
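
As a minimal sketch of the comparison just described, the following Python function computes a first-order partial correlation from the three zero-order correlations, using the standard textbook formula. The function name and the numerical values are hypothetical and are not taken from the paper; the values are chosen only so that $r_{xy}$ equals the product of $r_{xz}$ and $r_{yz}$.

```python
import math


def partial_corr(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Partial correlation of x and y with z held constant (standard formula)."""
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical zero-order correlations, not data from the paper.
r_xz = 0.6
r_yz = -0.5
r_xy = r_xz * r_yz  # the zero-order correlation is exactly the product

print("r_xy   =", r_xy)
print("r_xy.z =", partial_corr(r_xy, r_xz, r_yz))
```

A vanishing partial correlation of this kind is what both case (a) and case (b) produce, which is why the coefficients alone cannot separate the two.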

The problem may be clarified by a pair of specific examples adapted
from Zeisel.³

I. The data consist of measurements of three variables in a number
of groups of people: x is the percentage of members of the group that is
married, y is the average number of pounds of candy consumed per
month per member, z is the average age of members of the group. A
high (negative) correlation, $r_{xy}$, was observed between marital status
and amount of candy consumed. But there was also a high (negative)
correlation, $r_{yz}$, between candy consumption and age; and a high
(positive) correlation, $r_{xz}$, between marital status and age. However,
when age was held constant, the correlation $r_{xy \cdot z}$ between marital
status and candy consumption was nearly zero. By our previous
analysis, either age is an intervening variable between marital status
and candy consumption; or the correlation between marital status
and candy consumption is spurious, being a joint effect caused by the
variation in age. “Common sense” (the nature of which we will want
to examine below in detail) tells us that the latter explanation is the
correct one.

2 Simon [6] and [7]. See also Orcutt [4] and [5]. I should like, without elaborating it here, to insert
the caveat that the concept of causal ordering employed in this paper does not in any way solve the
“problem of Hume” nor contradict his assertion that all we can ever observe are covariations. If we
employ an ontological definition of cause (one based on the notion of the “necessary” connection of
events), then correlation cannot, of course, prove causation. But neither can anything else prove
causation, and hence we can have no basis for distinguishing “true” from “spurious” correlation. If we
wish to retain the latter distinction (and working scientists have not shown that they are able to get
along without it), and if at the same time we wish to remain empiricists, then the term “cause” must be
defined in a way that does not entail objectionable ontological consequences. That is the course we
shall pursue here.

3 Zeisel [9], pp. 192-95. Reference to the original source will show that in this and the following
example we have changed the variables from attributes to continuous variables for purposes of
exposition.

II. The data consist again of measurements of three variables in a
number of groups of people: x is the percentage of female employees
who are married, y is the average number of absences per week per
employee, z is the average number of hours of housework performed
per week per employee.⁴ A high (positive) correlation, $r_{xy}$, was
observed between marriage and absenteeism. However, when the amount
of housework, z, was held constant, the correlation $r_{xy \cdot z}$ was virtually
zero. In this case, by applying again some common sense notions about
the direction of causation, we reach the conclusion that z is an
intervening variable between x and y: that is, that marriage results in a
higher average amount of housework performed, and this, in turn, in
more absenteeism.

Now what is bothersome about these two examples is that the same 
statistical evidence, so far as the coefficients of correlation are con- 
cerned, has been used to reach entirely different conclusions in the two 
cases. In the first case we concluded that the correlation between x and
y was spurious; in the second case that there was a true causal
relationship, mediated by the intervening variable z. Clearly, it was not the
statistical evidence, but the “common sense” assumptions added after- 
wards, that permitted us to draw these distinct conclusions. 
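
The point can be illustrated with a small simulation. The sketch below is not part of the original argument: the two generating mechanisms, their coefficients, the sample size, and the seed are all hypothetical choices. It draws data once from a system in which z is a common cause of x and y, and once from a system in which z intervenes between x and y, and reports the zero-order and partial correlations for each.

```python
import math
import random


def corr(a, b):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    s_aa = sum((p - mean_a) ** 2 for p in a)
    s_bb = sum((q - mean_b) ** 2 for q in b)
    return s_ab / math.sqrt(s_aa * s_bb)


def partial_corr(x, y, z):
    """Sample partial correlation of x and y with z held constant."""
    r_xy, r_xz, r_yz = corr(x, y), corr(x, z), corr(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def simulate_common_cause(n, rng):
    """z influences both x and y; x has no effect on y."""
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [0.8 * zi + rng.gauss(0, 1) for zi in z]
    y = [-0.7 * zi + rng.gauss(0, 1) for zi in z]
    return x, y, z


def simulate_intervening(n, rng):
    """x influences z, and z influences y; x acts on y only through z."""
    x = [rng.gauss(0, 1) for _ in range(n)]
    z = [0.8 * xi + rng.gauss(0, 1) for xi in x]
    y = [0.7 * zi + rng.gauss(0, 1) for zi in z]
    return x, y, z


rng = random.Random(0)
results = {}
for name, simulate in [("common cause", simulate_common_cause),
                       ("intervening", simulate_intervening)]:
    x, y, z = simulate(20000, rng)
    results[name] = (corr(x, y), partial_corr(x, y, z))
    print(name, "r_xy = %.3f" % results[name][0], "r_xy.z = %.3f" % results[name][1])
```

In both runs the zero-order correlation is clearly non-zero while the partial correlation is close to zero, so the coefficients by themselves do not reveal which mechanism generated the data.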


2. CAUSAL RELATIONS 


In investigating spurious correlation we are interested in learning 
whether the relation between two variables persists or disappears when 
we introduce a third variable. Throughout this paper (as in all ordinary 
correlation analyses) we will assume that the relations in question are 
linear, and without loss of generality, that the variables are measured 
from their respective means. 





4 Zeisel [9], pp. 191-92. 
Now suppose we have a system of three variables whose behavior is
determined by some set of linear mechanisms. In general we will need
three mechanisms, each represented by an equation: three equations
to determine the three variables. One such set of mechanisms would
be that in which each of the variables directly influenced the other
two. That is, in one equation x would appear as the dependent variable,
y and z as independent variables; in the second equation y would
appear as the dependent variable, x and z as the independent variables;
in the third equation, z as dependent variable, x and y as independent
variables.⁵

The equations would look like this:


(2.1) $x + a_{12} y + a_{13} z = u_1$,
(I) (2.2) $a_{21} x + y + a_{23} z = u_2$,
(2.3) $a_{31} x + a_{32} y + z = u_3$,

where the u’s are “error” terms that measure the net effects of all other
variables (those not introduced explicitly) upon the system. We refer
to $A = \|a_{ij}\|$ as the coefficient matrix of the system.


Next, let us suppose that not all the variables directly influence all
the others, that is, that some independent variables are absent from some of
the equations. This is equivalent to saying that some of the elements of
the coefficient matrix are zero. By way of specific example, let us
assume that $a_{31} = a_{32} = a_{21} = 0$. Then the equation system (I) reduces to:

(2.4) $x + a_{12} y + a_{13} z = u_1$,
(II) (2.5) $y + a_{23} z = u_2$,
(2.6) $z = u_3$.


By examining the equations (II), we see that a change in $u_3$ will
change the value of z directly, and the values of x and y indirectly;
a change in $u_2$ will change y directly and x indirectly, but will leave z
unchanged; a change in $u_1$ will change only x. Then we may say that y
is causally dependent on z in (II), and that x is causally dependent on y
and z.
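
The propagation of changes through system (II) can be traced numerically. The following sketch solves (2.4) to (2.6) by back-substitution and records which variables move when a single error term is altered. The coefficient values and the helper names are hypothetical, chosen only for illustration.

```python
def solve_system_ii(u1, u2, u3, a12, a13, a23):
    """Solve system (II) by back-substitution, starting from z."""
    z = u3                       # (2.6)
    y = u2 - a23 * z             # (2.5)
    x = u1 - a12 * y - a13 * z   # (2.4)
    return x, y, z


def changed(before, after, tol=1e-12):
    """Names of the variables whose values differ between two solutions."""
    return {name for name, b, a in zip("xyz", before, after) if abs(a - b) > tol}


# Hypothetical coefficients; none of them is zero.
coeffs = dict(a12=0.5, a13=-0.3, a23=0.4)
base = solve_system_ii(1.0, 1.0, 1.0, **coeffs)

effects = {
    "u1": changed(base, solve_system_ii(2.0, 1.0, 1.0, **coeffs)),
    "u2": changed(base, solve_system_ii(1.0, 2.0, 1.0, **coeffs)),
    "u3": changed(base, solve_system_ii(1.0, 1.0, 2.0, **coeffs)),
}
for name in ("u1", "u2", "u3"):
    print(name, "changes", sorted(effects[name]))
```

The order of substitution (z, then y, then x) mirrors the causal ordering read off from the zero pattern of the coefficient matrix.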

If x and y were correlated, we would say that the correlation was
genuine in the case of the system (II), for $a_{12} \neq 0$. Suppose, instead, that
the system were (III):

(2.7) $x + a_{13} z = u_1$,
(III) (2.8) $y + a_{23} z = u_2$,
(2.9) $z = u_3$.

5 The question of how we distinguish between “dependent” and “independent” variables is
discussed in Simon [7], and will receive further attention in this paper.


In this case we would regard the correlation between x and y as
spurious, because it is due solely to the influence of z on the variables
x and y. Systems (II) and (III) are, of course, not the only possible
cases, and we shall need to consider others later.


3. THE A PRIORI ASSUMPTIONS


We shall show that the decision that a partial correlation is or is not 
spurious (does not or does indicate a causal ordering) can in general 
only be reached if a priori assumptions are made that certain other 
causal relations do not hold among the variables. This is the meaning 
of the “common sense” assumptions mentioned earlier. Let us make
this more precise.

Apart from any statistical evidence, we are prepared to assert in
the first example of Section 1 that the age of a person does not depend
upon either his candy consumption or his marital status. Hence z
cannot be causally dependent upon either x or y. This is a genuine
empirical assumption, since the variable “chronological age” really stands,
in these equations, as a surrogate for physiological and sociological
age. Nevertheless, it is an assumption that we are quite prepared to
make on evidence apart from the statistics presented. Similarly, in
the second example of Section 1, we are prepared to assert (on grounds
of other empirical knowledge) that marital status is not causally
dependent upon either amount of housework or absenteeism.⁶

The need for such a priori assumptions follows from considerations of
elementary algebra. We have seen that whether a correlation is genuine
or spurious depends on which of the coefficients, $a_{ij}$, of A are zero, and
which are non-zero. But these coefficients are not observable, nor are
the “error” terms, $u_1$, $u_2$, and $u_3$. What we observe is a sample of values
of x, y, and z.

Hence, from the standpoint of the problem of statistical estimation,
we must regard the 3n sample values of x, y, and z as numbers given by
observation, and the 3n error terms, $u_i$, together with the six
coefficients, $a_{ij}$, as variables to be estimated. But then we have (3n+6)
variables (3n u’s and six a’s) and only 3n equations (three for each
sample point). Speaking roughly in “equation-counting” terms, we
need six more equations, and we depend on the a priori assumptions
to provide these additional relations.

6 Since these are empirical assumptions it is conceivable that they are wrong, and indeed, we can
imagine mechanisms that would reverse the causal ordering in the second example. What is argued
here is that these assumptions, right or wrong, are implicit in the determination of whether the
correlation is true or spurious.

The a priori assumptions we commonly employ are of two kinds:

(1) A priori assumptions that certain variables are not directly
dependent on certain others. Sometimes such assumptions come from
knowledge of the time sequence of events. That is, we make the general
assumption about the world that if y precedes x in time, then $a_{21} = 0$;
that is, x does not directly influence y.

(2) A priori assumptions that the errors are uncorrelated, i.e., that
“all other” variables influencing x are uncorrelated with “all other”
variables influencing y, and so on. Writing $E(u_i u_j)$ for the expected
value of $u_i u_j$, this gives us the three additional equations:


$E(u_1 u_2) = 0$; $E(u_1 u_3) = 0$; $E(u_2 u_3) = 0$.


Again it must be emphasized that these assumptions are “a priori” 
only in the sense that they are not derived from the statistical data 
from which the correlations among x, y, and z are computed. The
assumptions are clearly empirical.

As a matter of fact, it is precisely because we are unwilling to make
the analogous empirical assumptions in the two-variable case (the
correlation between x and y alone) that the problem of spurious
correlation arises at all. For consider the two-variable system:


(IV) (3.1) $x + b_{12} y = v_1$,
(3.2) $y = v_2$.

We suppose that y precedes x in time, so that we are willing to set
$b_{21} = 0$ by an assumption of type (1). Then, if we make the type (2)
assumption that $E(v_1 v_2) = 0$, we can immediately obtain a unique
estimate of $b_{12}$. For multiplying the two equations, and taking expected
values, we get:

(3.3) $E(xy) + b_{12} E(y^2) = E(v_1 v_2) = 0$.

Whence

(3.4) $b_{12} = -\dfrac{E(xy)}{E(y^2)} = -\dfrac{\sigma_x}{\sigma_y}\, r_{xy}$.


It follows immediately that (sampling questions aside) $b_{12}$ will be zero
or non-zero as $r_{xy}$ is zero or non-zero. Hence correlation is proof of
causation in the two-variable case if we are willing to make the assumptions of
time precedence and non-correlation of the error terms.
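
A short numerical check may make the estimate concrete. The sketch below generates data from a two-variable system of the form $x + b_{12} y = v_1$, $y = v_2$ with uncorrelated errors and recovers the coefficient as the negative of the ratio of $E(xy)$ to $E(y^2)$. The true coefficient, the sample size, and the seed are hypothetical choices made for illustration, and the sample moments are simply taken as estimates of the expected values.

```python
import random


def estimate_b12(xs, ys):
    """Estimate b12 as -E(xy)/E(y^2), with variables measured from their means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    e_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    e_yy = sum((y - mean_y) ** 2 for y in ys) / n
    return -e_xy / e_yy


rng = random.Random(1)
true_b12 = -0.5  # hypothetical value
n = 20000

# y = v2 and x = v1 - b12 * y, with v1 and v2 drawn independently.
ys = [rng.gauss(0, 1) for _ in range(n)]
xs = [rng.gauss(0, 1) - true_b12 * y for y in ys]

estimate = estimate_b12(xs, ys)
print("true b12 = %.3f, estimated b12 = %.3f" % (true_b12, estimate))
```

If the two error terms shared a common component, the same computation would still return a number, but it would no longer estimate the structural coefficient; this is the situation the third variable is meant to test.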

If we suspect the correlation to be spurious, we look for a common
component, z, of $v_1$ and $v_2$ which might account for their correlation:


(3.5a) $v_1 = u_1 - a_{13} z$,
(3.5b) $v_2 = u_2 - a_{23} z$.


Substitution of these relations in (IV) brings us back immediately to 
systems like (II). This substitution replaces the unobservable v’s by 
unobservable u’s. Hence, we are not relieved of the necessity of
postulating independence of the errors. We are more willing to make these
assumptions in the three-variable case because we have explicitly re- 
moved from the error term the component z which we suspect is the 
source, if any, of the correlation of the v’s. 

Stated otherwise, introduction of the third variable, z, to test the 
genuineness or spuriousness of the correlation between x and y, is a 
method for determining whether in fact the v’s of the original
two-variable system were uncorrelated. But the test can be carried out
only on the assumption that the unobservable error terms of the
three-variable system are uncorrelated. If we suspect this to be false, we must
further enlarge the system by introduction of a fourth variable, and so 
on, until we obtain a system we are willing to regard as “complete” in 
this sense. 

Summarizing our analysis we conclude that: 

(1) Our task is to determine which of the six off-diagonal matrix 
coefficients in a system like (I) are zero. 

(2) But we are confronted with a system containing a total of nine 
variables (six coefficients and three unobservable errors), and only 
three equations. 

(3) Hence we must obtain six more relations by making certain a 
priori assumptions. 

(a) Three of these relations may be obtained, from considerations of
time precedence of variables or analogous evidence, in the form of
direct assumptions that three of the $a_{ij}$ are zero.

(b) Three more relations may be obtained by assuming the errors 
to be uncorrelated. 


4. SPURIOUS CORRELATION 


Before proceeding with the algebra, it may be helpful to look a little
more closely at the matrix of coefficients in systems like (I), (II), and
(III), disregarding the numerical values of the coefficients, but
considering only whether they are non-vanishing (X), or vanishing (0).
An example of such a matrix would be


X 0 0
X X X
0 0 X


In this case x and z both influence y, but not each other, and y
influences neither x nor z. Moreover, a change in $u_2$ ($u_1$ and $u_3$ being
constant) will change y, but not x or z; a change in $u_1$ will change x
and y, but not z; a change in $u_3$ will change z and y, but not x. Hence
the causal ordering may be depicted thus:


x → y ← z


In this case the correlation between x and y is “true,” and not
spurious.

Since there are six off-diagonal elements in the matrix, there are
$2^6 = 64$ possible configurations of X’s and 0’s. The a priori assumptions
(1), however, require 0’s in three specified cells, and hence for each
such set of assumptions there are only $2^3 = 8$ possible distinct
configurations. If (to make a definite assumption) x does not depend on y, then
there are three possible orderings of the variables (z, x, y; x, z, y; x, y,
z), and consequently $3 \cdot 8 = 24$ possible configurations, but these 24
configurations are not all distinct. For example, the one depicted above
is consistent with either the ordering (z, x, y) or the ordering (x, z, y).
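
The counting argument can be checked by enumeration. In the sketch below a configuration is encoded as the set of off-diagonal cells (i, j) in which $a_{ij}$ is non-zero, and a time ordering permits $a_{ij} \neq 0$ only when variable j precedes variable i. This encoding and the function names are assumptions introduced here for illustration.

```python
from itertools import product

# Cell (i, j) stands for coefficient a_ij: equation i contains variable j.
VARIABLES = {"x": 1, "y": 2, "z": 3}
OFF_DIAGONAL = [(i, j) for i in (1, 2, 3) for j in (1, 2, 3) if i != j]


def configurations(ordering):
    """All X/0 patterns consistent with a time ordering such as "zxy"."""
    position = {VARIABLES[name]: rank for rank, name in enumerate(ordering)}
    # a_ij may be non-zero only if variable j precedes variable i.
    free_cells = [(i, j) for (i, j) in OFF_DIAGONAL if position[j] < position[i]]
    patterns = set()
    for choice in product((False, True), repeat=len(free_cells)):
        patterns.add(frozenset(c for c, on in zip(free_cells, choice) if on))
    return patterns


unrestricted = 2 ** len(OFF_DIAGONAL)
orderings = ["zxy", "xzy", "xyz"]  # the orderings in which x does not depend on y
per_ordering = {o: configurations(o) for o in orderings}
distinct = set().union(*per_ordering.values())

print("unrestricted patterns:", unrestricted)
print("patterns per ordering:", [len(p) for p in per_ordering.values()])
print("distinct patterns over the three orderings:", len(distinct))
```

Under this encoding the three orderings share several patterns, so the union is smaller than 24; the pattern in which only $a_{21}$ and $a_{23}$ are non-zero, for instance, appears under both (z, x, y) and (x, z, y).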

Still assuming that x does not depend on y, we will be interested, in
particular, in the following configurations:


 (α)       (β)       (γ)       (δ)       (ε)

X 0 0     X 0 X     X 0 0     X 0 X     X 0 0
X X X     X X 0     X X 0     0 X X     0 X X
0 0 X     0 0 X     X 0 X     0 0 X     X 0 X


In Case α, either x may precede z, or z may precede x. In Cases β and δ, z
precedes x; in Cases γ and ε, x precedes z. The causal orderings that may be
inferred are:


(α) x → y ← z
(β) z → x → y
(γ) z ← x → y
(δ) x ← z → y
(ε) x → z → y


The two cases we were confronted with in our earlier examples of
Section 1 were δ and ε, respectively. Hence, δ is the case of spurious
correlation due to z; ε the case of true correlation with z as an
intervening variable.

We come now to the question of which of the matrices that are
consistent with the assumed time precedence is the correct one. Suppose,
for definiteness, that z precedes x, and x precedes y. Then
$a_{12} = a_{31} = a_{32} = 0$; and the system (I) reduces to:


(4.1) $x + a_{13} z = u_1$,
(4.2) $a_{21} x + y + a_{23} z = u_2$,
(4.3) $z = u_3$.

Next, we assume the errors to be uncorrelated:

(4.4) $E(u_1 u_2) = E(u_1 u_3) = E(u_2 u_3) = 0$.


Multiplying equations (4.1)-(4.3) by pairs, and taking expected
values, we get:


(4.5) $a_{21} E(x^2) + E(xy) + a_{23} E(xz) + a_{13}[a_{21} E(xz) + E(yz) + a_{23} E(z^2)] = E(u_1 u_2) = 0$,
(4.6) $E(xz) + a_{13} E(z^2) = E(u_1 u_3) = 0$,
(4.7) $a_{21} E(xz) + E(yz) + a_{23} E(z^2) = E(u_2 u_3) = 0$.

Because of (4.7), the terms in the bracket of (4.5) vanish, giving:

(4.8) $a_{21} E(x^2) + E(xy) + a_{23} E(xz) = 0$.

Solving for E(xz), E(yz), and E(xy) we find:

(4.9) $E(xz) = -a_{13} E(z^2)$,
(4.10) $E(yz) = (a_{13} a_{21} - a_{23}) E(z^2)$,
(4.11) $E(xy) = a_{13} a_{23} E(z^2) - a_{21} E(x^2)$.
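
Relations (4.9) to (4.11) can be verified numerically from the reduced form of the system. In the sketch below each variable is written as a linear combination of the uncorrelated errors $u_1$, $u_2$, $u_3$, the second moments are computed from those loadings, and the results are compared with the three formulas. The coefficient values and error variances are hypothetical, and the helper names are introduced only for this check.

```python
def reduced_form(a13, a21, a23):
    """Loadings of x, y, z on (u1, u2, u3), obtained by solving (4.1)-(4.3)."""
    z = (0.0, 0.0, 1.0)                       # z = u3
    x = (1.0, 0.0, -a13)                      # x = u1 - a13*z
    u2 = (0.0, 1.0, 0.0)
    y = tuple(e - a21 * xi - a23 * zi         # y = u2 - a21*x - a23*z
              for e, xi, zi in zip(u2, x, z))
    return x, y, z


def moment(p, q, variances):
    """E(pq) for linear combinations p and q of uncorrelated errors."""
    return sum(pi * qi * s for pi, qi, s in zip(p, q, variances))


# Hypothetical coefficients and error variances.
a13, a21, a23 = 0.5, -0.4, 0.3
variances = (1.0, 2.0, 1.5)

x, y, z = reduced_form(a13, a21, a23)
e_xx, e_zz = moment(x, x, variances), moment(z, z, variances)
e_xz, e_yz, e_xy = (moment(x, z, variances), moment(y, z, variances),
                    moment(x, y, variances))

checks = {
    "(4.9)": (e_xz, -a13 * e_zz),
    "(4.10)": (e_yz, (a13 * a21 - a23) * e_zz),
    "(4.11)": (e_xy, a13 * a23 * e_zz - a21 * e_xx),
}
for label, (direct, formula) in checks.items():
    print(label, round(direct, 6), round(formula, 6))
```

The variance of $u_2$ does not enter any of the three cross-products, which is consistent with the formulas involving only $E(x^2)$ and $E(z^2)$.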
Case α: Now in the matrix of Case α, above, we have $a_{13} = 0$. Hence:

(4.12a) $E(xz) = 0$;
(4.12b) $E(yz) = -a_{23} E(z^2)$;
(4.12c) $E(xy) = -a_{21} E(x^2)$.

Case β: In this case, $a_{23} = 0$. Hence,

(4.13a) $E(xz) = -a_{13} E(z^2)$;
(4.13b) $E(yz) = a_{13} a_{21} E(z^2)$;
(4.13c) $E(xy) = -a_{21} E(x^2)$;

from which it also follows that:

(4.14) $E(xy) = E(x^2)\,\dfrac{E(yz)}{E(xz)}$.

Case δ: In this case, $a_{21} = 0$. Hence,

(4.15a) $E(xz) = -a_{13} E(z^2)$;
(4.15b) $E(yz) = -a_{23} E(z^2)$;
(4.15c) $E(xy) = a_{13} a_{23} E(z^2)$;

and we deduce also that:

(4.16) $E(xy) = \dfrac{E(xz) E(yz)}{E(z^2)}$.

We have now proved that $a_{13} = 0$ implies (4.12); that $a_{23} = 0$ implies
(4.14); and that $a_{21} = 0$ implies (4.16). We shall show that the converse
also holds.

To prove that (4.12a) implies $a_{13} = 0$ we need only set the left-hand
side of (4.9) equal to zero.

To prove that (4.14) implies that $a_{23} = 0$ we substitute in (4.14) the
values of the cross-products from (4.9)-(4.11). After some
simplification, we obtain:

(4.17) $a_{23}[E(x^2) - a_{13}^2 E(z^2)] = 0$.

Now since, from (4.1),

(4.18) $E(x^2) - E(u_1^2) + 2 a_{13} E(z u_1) = a_{13}^2 E(z^2)$,

and since, by multiplying (4.3) by $u_1$, we can show that $E(z u_1) = 0$,
the second factor of (4.17) can vanish only in case $E(u_1^2) = 0$. Excluding
this degenerate case, we conclude that $a_{23} = 0$.

To prove that (4.16) implies that $a_{21} = 0$, we proceed in a similar
manner, obtaining:

(4.19) $a_{21}[E(x^2) - a_{13}^2 E(z^2)] = 0$,

from which we can conclude that $a_{21} = 0$.
We can summarize the results as follows:

1) If $E(xz) = 0$, $E(yz) \neq 0$, $E(xy) \neq 0$, we have Case α.

2) If none of the cross-products is zero, and

$E(xy) = E(x^2)\,\dfrac{E(yz)}{E(xz)}$,

we have Case β.

3) If none of the cross-products is zero, and

$E(xy) = \dfrac{E(xz) E(yz)}{E(z^2)}$,

we have Case δ.

We can combine these conditions to find the conditions that two or
more of the coefficients $a_{13}$, $a_{23}$, $a_{21}$ vanish:

4) If $a_{13} = a_{23} = 0$, we find that:
$E(xz) = 0$, $E(yz) = 0$. Call this Case (αβ).

5) If $a_{13} = a_{21} = 0$, we find that:
$E(xz) = 0$, $E(xy) = 0$. Call this Case (αδ).

6) If $a_{23} = a_{21} = 0$, we find that:
$E(yz) = 0$, $E(xy) = 0$. Call this Case (βδ).

7) If $a_{13} = a_{23} = a_{21} = 0$, then
$E(xz) = E(yz) = E(xy) = 0$. Call this Case (αβδ).

8) If none of the conditions (1)-(7) are satisfied, then all three
coefficients $a_{13}$, $a_{23}$, $a_{21}$ are non-zero. Thus, by observing which of the
conditions (1) through (8) are satisfied by the expected values of the
cross products, we can determine what the causal ordering is of the
variables.⁷
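
The eight conditions amount to a decision rule on the second moments, which may be sketched as follows. The function name, the string labels, and the tolerance used to decide whether a quantity is zero are assumptions of this sketch; the paper sets sampling questions aside, so the tolerance stands in for whatever test of significance would be used in practice. The moments passed in the examples are hypothetical values computed from (4.9)-(4.11) with unit error variances.

```python
def classify(e_xy, e_xz, e_yz, e_xx, e_zz, tol=1e-9):
    """Return the case implied by the cross-products, assuming z precedes x precedes y."""
    def is_zero(value):
        return abs(value) <= tol

    zero_xy, zero_xz, zero_yz = is_zero(e_xy), is_zero(e_xz), is_zero(e_yz)

    if zero_xz and zero_yz and zero_xy:
        return "alpha-beta-delta"   # condition 7
    if zero_xz and zero_yz:
        return "alpha-beta"         # condition 4
    if zero_xz and zero_xy:
        return "alpha-delta"        # condition 5
    if zero_yz and zero_xy:
        return "beta-delta"         # condition 6
    if zero_xz:
        return "alpha"              # condition 1
    if not zero_xy and not zero_yz:
        # Conditions 2 and 3, cross-multiplied to avoid division.
        if is_zero(e_xy * e_xz - e_xx * e_yz):
            return "beta"
        if is_zero(e_xy * e_zz - e_xz * e_yz):
            return "delta"
    return "all non-zero"           # condition 8


# Hypothetical moments for three coefficient patterns.
print(classify(e_xy=0.15, e_xz=-0.5, e_yz=-0.3, e_xx=1.25, e_zz=1.0))  # a21 = 0
print(classify(e_xy=0.5, e_xz=-0.5, e_yz=-0.2, e_xx=1.25, e_zz=1.0))   # a23 = 0
print(classify(e_xy=0.65, e_xz=-0.5, e_yz=-0.5, e_xx=1.25, e_zz=1.0))  # none zero
```

The combined cases are tested before the single cases so that, for example, a pattern with both $E(xz)$ and $E(yz)$ equal to zero is not reported as Case α alone.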

We can see also, from this analysis, why the vanishing of the partial
correlation of x and y is evidence for the spuriousness of the zero-order
correlation between x and y. For the numerator of the partial
correlation coefficient $r_{xy \cdot z}$, we have:

(4.20) $N(r_{xy \cdot z}) = \dfrac{E(xy)}{\sqrt{E(x^2) E(y^2)}} - \dfrac{E(xz) E(yz)}{E(z^2) \sqrt{E(x^2) E(y^2)}}$.

We see that the condition for Case δ is precisely that $r_{xy \cdot z}$ vanish
while none of the coefficients $r_{xy}$, $r_{xz}$, $r_{yz}$ vanish. From this we conclude
that the first illustrative example of Section 1 falls in Case δ, as
previously asserted. A similar analysis shows that the second illustrative
example of Section 1 falls in Case ε.

7 Of course, the expected values are not, strictly speaking, observables except in a probability
sense. However, we do not wish to go into sampling questions here, and simply assume that we have
good estimates of the expected values.

In summary, our procedure for interpreting, by introduction of an 
additional variable z, the correlation between x and y consists in making
the six a priori assumptions described earlier; estimating the expected 
values, E(xy), E(xz), and E(yz); and determining from their values 
which of the eight enumerated cases holds. Each case corresponds to a 
specified arrangement of zero and non-zero elements in the coefficient 
matrix and hence to a definite causal ordering of the variables. 


5. THE CASE OF EXPERIMENTATION 


In Sections 3 and 4 we have treated $u_1$, $u_2$, and $u_3$ as random variables.
The causal ordering among x, y, and z can also be determined without
a priori assumptions in the case where $u_1$, $u_2$, and $u_3$ are controlled by
an experimenter. For simplicity of illustration we assume there is time
precedence among the variables. Then the matrix is triangular, so that
$a_{ij} \neq 0$ implies $a_{ji} = 0$; and $a_{ij} \neq 0$, $a_{jk} \neq 0$ implies $a_{ki} = 0$.

Under the given assumptions at least three of the off-diagonal a’s in
(I) must vanish, and the equations and variables can be reordered so
that all the non-vanishing coefficients lie on or below the diagonal. If
(with this ordering) $u_2$ or $u_3$ are varied, at least the variable determined
by the first equation will remain constant (since it depends only on
$u_1$). Similarly, if $u_3$ is varied, the variables determined by the first and
second equations will remain constant.

In this way we discover which variables are determined by which
equations. Further, if varying $u_i$ causes a particular variable other than
the ith to change in value, this variable must be causally dependent on
the ith.

Suppose, for example, that variation in $u_1$ brings about a change in
x and y, variation in $u_2$ a change in y, and variation in $u_3$ a change in
x, y, and z. Then we know that y is causally dependent upon x and z,
and x upon z. But this is precisely the Case β treated previously under
the assumption that the u’s were stochastic variables.
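
The experimental procedure can be imitated numerically. In the sketch below the system is treated as a black box: each error term is shifted in turn, the system is re-solved, and the variables that respond are recorded. The coefficient values are hypothetical and follow the zero pattern of Case β. The pairing of the ith error term with the ith variable is taken as already known, and a small Cramer's-rule solver is used only to keep the example self-contained.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, u):
    """Solve a * (x, y, z) = u by Cramer's rule."""
    d = det3(a)
    solution = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = u[r]
        solution.append(det3(m) / d)
    return solution


def responses(a, base_u=(0.0, 0.0, 0.0), delta=1.0, tol=1e-9):
    """For each error term, the set of variables that move when it alone is varied."""
    base = solve3(a, list(base_u))
    table = {}
    for i in range(3):
        u = list(base_u)
        u[i] += delta
        moved = solve3(a, u)
        # Key the result by the variable that the i-th equation determines.
        table["xyz"[i]] = {n for n, b, v in zip("xyz", base, moved) if abs(v - b) > tol}
    return table


def dependencies(table):
    """Map each variable to the set of variables it is causally dependent on."""
    depends = {name: set() for name in "xyz"}
    for source, movers in table.items():
        for name in movers - {source}:
            depends[name].add(source)
    return depends


# Hypothetical coefficients with the zero pattern of Case beta (a23 = 0).
a13, a21 = 0.5, -0.4
A = [[1.0, 0.0, a13],   # x + a13*z = u1
     [a21, 1.0, 0.0],   # a21*x + y = u2
     [0.0, 0.0, 1.0]]   # z = u3

table = responses(A)
print(table)
print(dependencies(table))
```

Dependence here includes indirect influence: y responds to a change in $u_3$ only because z acts on x and x acts on y.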


6. CONCLUSION 


In this paper I have tried to clarify the logical processes and assump- 
tions that are involved in the usual procedures for testing whether 
a correlation between two variables is true or spurious. These pro- 
cedures begin by imbedding the relation between the two variables in 
a larger three-variable system that is assumed to be self-contained, 
except for stochastic disturbances or parameters controlled by an
experimenter.

Since the coefficients in the three-variable system will not in general 
be identifiable, and since the determination of the causal ordering 
implies identifiability, the test for spuriousness of the correlation re- 
quires additional assumptions to be made. These assumptions are 
usually of two kinds. The first, ordinarily made explicit, are assump- 
tions that certain variables do not have a causal influence on certain 
others. These assumptions reduce the number of degrees of freedom of 
the system of coefficients by implying that three specified coefficients 
are zero. 

The second type of assumption, more often implicit than explicit, 
is that the random disturbances associated with the three-variable 
system are uncorrelated. This assumption gives us a sufficient number 
of additional restrictions to secure the identifiability of the remaining 
coefficients, and hence to determine the causal ordering of the variables. 


EMPIRICAL STUDY OF THE ACCURACY OF SELECTED
METHODS OF PROJECTING STATE POPULATIONS*


HELEN R. WHITE
United States Bureau of Agricultural Economics 


As tentative guides in the preparation of population projec- 
tions for geographic subdivisions of the United States, the 
accuracy in the past of several methods of projecting popula- 
tion has been measured. These measures have been analyzed 
to some extent for information on the effects of selected factors 
other than methodology on the accuracy of projections. 


ERRORS in particular population projections have been noted and
analyzed in the literature of this field, and various methods of
projecting population have been evaluated.¹ Nevertheless, little has
been written on the accuracy of population projections in general or 
about the effects of such factors as the size of the base population, past 
migration rates, and length of the projection period on the accuracy 
of population projections. The test described below was undertaken in 
order to provide some guides in deciding what methods should be used 
in projecting the populations of geographic subdivisions of the United 
States and in deciding whether any projections should be prepared in 
certain cases. 

Design of test. The study is based on a comparison of projections
to 1940 and 1950 of the 1930 Census population, for each state and for
the District of Columbia, prepared by various methods, with the
Census data for those dates. Since the projections prepared by Thompson
and Whelpton and published in Estimates of Future Population by
States (National Resources Board, 1934) are based on the 1930 Census,
these projections could be used to represent the cohort-survival
method.² The other methods by which projections have been prepared
are limited to those which did not involve extensive computing and
for which the results would not be biased by the worker’s knowledge
of population trends since 1930. These methods are the geometric,
arithmetic, apportionment, and ratio methods.

* This paper is based on a project of the Bureau of the Census carried on while the author was
employed at that agency. The assistance of Mrs. Beatrice M. Rosen of the Bureau of the Census is
gratefully acknowledged. A summary of the results of this study was presented at a meeting of the
Population Association of America on April 19, 1952.

1 See Appendix.

2 The cohort-survival method involves making separate allowances for changes in each of its age
cohorts resulting from mortality and immigration; the initial sizes of cohorts born after the base date
are usually based on projected age-specific or cohort birth rates. More detailed descriptions of specific
applications of this method are given in: P. K. Whelpton, “An Empirical Method of Calculating
Future Population,” Journal of the American Statistical Association, 31 (1936), pp. 457-73; P. K.
Whelpton, Hope Tisdale Eldridge, and Jacob S. Siegel, Forecasts of the Population of the United States,
1945-1976, U. S. Government Printing Office, Washington, 1947; and Jacob S. Siegel and Helen R.
White, “Illustrative Projections of the Population of the United States, 1956 to 1960,” Current
Population Reports, Series P-25, No. 43, U. S. Bureau of the Census, August 1950.

Both the apportionment method and the ratio method require in- 
dependent projections of the total population of the United States for 
1940 and 1950. For this purpose, the sum of the Thompson-Whelpton 
projections for states, with an allowance for migration, and the actual 
decennial census national totals³ were used. Although the
cohort-survival, geometric, and arithmetic methods do not necessarily involve
the use of independent national totals, the latter two were also adjusted 
to the Thompson-Whelpton projections and all three were adjusted to 
the census national totals. 

The independent projection of the total population of the United 
States has been used as a control total (that is, the state projections 
have been forced to sum to the independent national projection) in the 
instances mentioned above. Such use is not an inherent characteristic 
of the ratio method but arises from the adjustment of the appropriate 
ratios to sum to 1.00. Projections were also obtained by one variation 
of the ratio method without adjustment of the ratios. These projections 
are referred to as “Ratio III (unadjusted)” in the text and are pre- 
sented in the tables under “Unadjusted to national total.” 
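
The use of a control total can be illustrated with a short sketch. The text does not state how the state projections were forced to the national figure; the example below assumes a simple proportional (pro rata) adjustment, and the state names and figures are hypothetical.

```python
def adjust_to_control_total(state_projections, national_total):
    """Scale state projections proportionally so that they sum to national_total."""
    raw_total = sum(state_projections.values())
    factor = national_total / raw_total
    return {state: value * factor for state, value in state_projections.items()}


# Hypothetical projections in thousands, not figures from the study.
raw = {"State A": 1200.0, "State B": 800.0, "State C": 500.0}
adjusted = adjust_to_control_total(raw, national_total=2600.0)

for state, value in adjusted.items():
    print(state, round(value, 1))
print("sum:", round(sum(adjusted.values()), 1))
```

A proportional adjustment leaves the relative sizes of the state projections unchanged, which is one reason it is a common choice; other allocation rules would be equally consistent with the description in the text.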

As the preparation of projections is not justified unless the results 
are better than those obtained by using the available current figures 
on the size of the population, measures of the errors involved in using 
the 1930 Census data for 1940 and 1950 have also been developed. The 
figures are presented in the various tables (under the designation of 
“Constant”) along with the results for the various methods. 

Thus the tables present results for the following methods and as- 
sumptions: 

I. Unadjusted to national total. 

1. Cohort-survival, with migration, the cohort-survival method, as- 
suming continuation of internal migration like 1920-30 (see Estimates 
of Future Population by States, mentioned above, for information on the 
basic assumptions). 

2. Cohort-survival, no migration, the same as (1) above except for the 
assumption of no internal migration. 

3. Geometric method, assuming the continuation of the 1920-30 
average annual rate of increase. 





* The 1950 Census total was adjusted to include members of the armed forces overseas except 
those inducted in the Territories and possessions. The 1940 total was not adjusted because of the small 
number of armed forces involved. 
4. Arithmetic method, assuming continuation of the 1920-30 average 
amount of increase per year. 

5. Constant, the 1930 enumerated population. 

6. Ratio III (1900 to 1930 modified), T & W national projection, the 
ratio method, using the rules presented in Current Population Reports, 
Series P-25, No. 56,⁴ for selecting the period used in computing the 
initial change in the ratio of the population of each division to the total 
population of the United States and the ratio of the population of each 
state to the population of its divisions; these rules, which in this case 
were applied to 1900-30, 1910-30, and 1920-30, first eliminate any 
period during which the given ratio did not either constantly increase 
or constantly decrease, and then select, from the remaining periods, 
the one for which the absolute value of the average annual rate of 
change in the given ratio was least. It was also assumed that the annual 
rate of change of each ratio would decrease linearly to zero within 
fifty years; i.e., by 1980. As these ratios were not adjusted to sum to 
1.00, the projected state populations do not sum to the Thompson- 
Whelpton national projection to which the ratios were applied. 

7. Ratio III (1900 to 1930 modified), census count, the same as (6) 
above except that the projected ratios were applied to the census na- 
tional totals. 

II. Adjusted to T & W national projection, using as a control total 
the independent T & W projection (with internal migration) of the total 
population of the United States. 

1. Geometric, the state projections of (I, 3) above adjusted propor- 
tionately to sum to the T & W national projection. 

2. Arithmetic, the state projections of (I, 4) above adjusted pro- 
portionately to sum to the T & W national projection. 

3. Apportionment method, assuming (a) that the increase in the 
total population of the United States, as indicated by the T & W pro- 
jection, would be distributed in accordance with the distribution of the 
1920-30 increase among those states whose population gained between 
1920 and 1930 and assuming (b) that the populations of those states 
in which there was a decrease during that period would remain con- 
stant. 

4. Ratio I (1870 to 1930), the ratio method, involving the projection 
of the ratio of the total population of each state to the total population 
of the United States on the assumptions (a) that the initial change in 
the ratio would be the same as the 1870-1930 average annual rate of 





4 Helen L. White and Jacob S. Siegel, “Projections of the Population by States: 1955 and 1960,” 
Current Population Reports, Series P-25, No. 56, Bureau of the Census, January 1952. 
change in the ratio, and (b) that the annual rate of change in the ratio 
would decrease linearly to zero by 1975; the projected ratios were ad- 
justed to sum to 1.00 and were then applied to the T & W national 
projection. 

5. Ratio II (1930), the ratio method, assuming that the ratios 
would remain at the 1930 level; this assumes that the per cent increase 
in the population of each state would be the same as that of the T & W 
national projection to which the assumed ratios were applied. 

6. Ratio III (1900 to 1930 modified), using the same assumptions 
as for (I, 6). The ratios were adjusted to sum to 1.00 and were then 
applied to the T & W national projection. 

III. Adjusted to census count, using the census count as a control 
total. 

1. Cohort-survival, with migration, the state projections of (I, 1) 
adjusted proportionately to sum to the census count of the total popu- 
lation. 

2. Cohort-survival, no migration, the state projections of (I, 2) ad- 
justed proportionately to sum to the census count. 

3. Geometric, the state projections of (I, 3) adjusted proportionately 
to sum to the census count. 

4. Arithmetic, the state projections of (I, 4) adjusted proportionately 
to sum to the census count. 

5. Apportionment, assuming (a) that the actual increase in the total 
population of the United States, from the census count, would be dis- 
tributed in accordance with the distribution of the 1920-30 increase 
among those states which gained population during that period and 
(b) that the population of those states in which there was a decrease 
during that period would remain constant. 

6. Ratio I (1870 to 1930), applying the ratios of (II, 4) to the census 
count. 

7. Ratio II (1930), applying the ratios of (II, 5) to the census count. 

8. Ratio III (1900 to 1930 modified), using the same assumptions as 
for (I, 6). The ratios were adjusted to sum to 1.00 and were then ap- 
plied to the census count. 
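
The geometric, arithmetic, and apportionment methods listed above may be sketched as follows. The sketch is an interpretation of the verbal descriptions, not the computing procedure actually used: it assumes a base period of `base_years` between the two censuses and a projection of `horizon_years` beyond the later one, and all function names and example figures are hypothetical.

```python
def geometric(pop_start, pop_end, base_years, horizon_years):
    """Continue the average annual rate of increase of the base period."""
    annual_ratio = (pop_end / pop_start) ** (1.0 / base_years)
    return pop_end * annual_ratio ** horizon_years


def arithmetic(pop_start, pop_end, base_years, horizon_years):
    """Continue the average annual amount of increase of the base period."""
    annual_amount = (pop_end - pop_start) / base_years
    return pop_end + annual_amount * horizon_years


def apportionment(pop_start, pop_end, national_projection):
    """Distribute the projected national increase among the states that gained
    in the base period, in proportion to their gains; states that lost
    population are held constant."""
    gains = {s: pop_end[s] - pop_start[s] for s in pop_end}
    total_gain = sum(g for g in gains.values() if g > 0)
    national_increase = national_projection - sum(pop_end.values())
    return {s: pop_end[s] + (national_increase * gains[s] / total_gain
                             if gains[s] > 0 else 0.0)
            for s in pop_end}


# Hypothetical three-state example (thousands).
pop_1920 = {"A": 100.0, "B": 200.0, "C": 50.0}
pop_1930 = {"A": 120.0, "B": 230.0, "C": 45.0}

geo_a = geometric(pop_1920["A"], pop_1930["A"], 10, 10)   # about 144
ari_a = arithmetic(pop_1920["A"], pop_1930["A"], 10, 10)  # 140
app = apportionment(pop_1920, pop_1930, 445.0)            # C stays at 45
print(geo_a, ari_a, app)
```

By construction the apportionment results sum to the national projection, whereas the geometric and arithmetic results do so only after a separate proportionate adjustment.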

As mentioned previously, the projections for 1940 and for 1950, for 
each state, described above, were compared with the 1940 and 1950 
Census returns. Before the comparison was made, the 1950 Census 
data for each state were adjusted to include members of the armed 
forces who resided in the given state at the time of induction and to 
exclude members of the armed forces stationed in the given state who 
did not reside there at the time of induction.⁵ The deviations of the 
projections from the census data are summarized in Table 1, which 
shows the average per cent error (average of the absolute values of the 
per cent deviations), the maximum per cent error (absolute value), 
the proportion of errors of ten per cent or more (absolute values), and 
the proportion of positive errors (over-estimates). 
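
The four summary measures just defined may be computed as in the following sketch. The function name, the dictionary keys, and the three-area example are hypothetical; the per cent deviation is taken as the projection minus the census figure, expressed as a per cent of the census figure, which is consistent with treating positive errors as over-estimates.

```python
def summarize_errors(projected, actual):
    """Return the four summary measures for one method and one year."""
    pct = [100.0 * (projected[s] - actual[s]) / actual[s] for s in actual]
    n = len(pct)
    return {
        # average of the absolute values of the per cent deviations (APE)
        "average_per_cent_error": sum(abs(e) for e in pct) / n,
        # largest absolute per cent deviation
        "maximum_per_cent_error": max(abs(e) for e in pct),
        # share of areas with an absolute error of 10 per cent or more
        "per_cent_of_errors_10_or_more": 100.0 * sum(abs(e) >= 10.0 for e in pct) / n,
        # share of areas that were over-estimated
        "per_cent_positive": 100.0 * sum(e > 0 for e in pct) / n,
    }


# Hypothetical example: deviations of +10, -5, and -20 per cent.
projected = {"A": 110.0, "B": 95.0, "C": 80.0}
actual = {"A": 100.0, "B": 100.0, "C": 100.0}
summary = summarize_errors(projected, actual)
print(summary)
```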

It must be kept in mind that the results presented here are only 
very rough guides for future periods. 

Accuracy of various methods.—The various methods are evaluated on 
the basis of the summary measures shown in Table 1 for the unad- 
justed projections and the projections adjusted to the Thompson- 
Whelpton national projections. The projections adjusted to the census 
counts are not considered, since census data could not be used for this 
purpose in actual practice. 

One of the rather interesting results of this study is that the cohort- 
survival method does not appear to yield definitely superior results. 
(It must be remembered, however, that this method has several ad- 
vantages over other methods not wholly dependent on its absolute 
validity.) In fact, no one method is clearly superior to all other methods 
tested and only the Ratio I method is clearly inferior. The Ratio I 
method is probably inferior because the basic assumptions place too 
much emphasis on population change in relatively remote periods; none 
of the other projections depend so much on population change prior to 
1900. 
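
The dependence of the ratio methods on the base-period change may be seen from the following sketch of a ratio projection. It rests on assumptions not stated in the text: the "rate of change" is treated as an absolute change in the ratio per year, and the linear decline to zero is integrated continuously over the projection period. The function names and example figures are hypothetical.

```python
def project_ratio(ratio_base, initial_annual_change, years_ahead, years_to_zero=50):
    """Ratio after years_ahead when the annual change declines linearly to zero."""
    t = min(years_ahead, years_to_zero)
    # Integral of initial_annual_change * (1 - u / years_to_zero) for u in [0, t]
    return ratio_base + initial_annual_change * (t - t * t / (2.0 * years_to_zero))


def ratio_projection(ratios_base, annual_changes, years_ahead, national_total,
                     adjust=True, years_to_zero=50):
    """Apply projected ratios to a national total, optionally forcing them to sum to 1.00."""
    ratios = {s: project_ratio(ratios_base[s], annual_changes[s], years_ahead, years_to_zero)
              for s in ratios_base}
    if adjust:
        total = sum(ratios.values())
        ratios = {s: r / total for s, r in ratios.items()}
    return {s: r * national_total for s, r in ratios.items()}


# Hypothetical two-state example with a national total of 1,000.
base = {"A": 0.6, "B": 0.4}
change = {"A": -0.002, "B": 0.003}
unadjusted = ratio_projection(base, change, 10, 1000.0, adjust=False)
adjusted_ratio = ratio_projection(base, change, 10, 1000.0, adjust=True)
print(sum(unadjusted.values()), sum(adjusted_ratio.values()))
```

In this sketch the unadjusted state figures do not sum to the national total, as with the "Ratio III (unadjusted)" projections, and the initial annual change enters every projected ratio directly, so that a change averaged over a remote base period carries through the whole projection.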

For 1940, the apportionment, the cohort-survival (with migration), 
the Ratio II, and the Ratio III (unadjusted) results are the best on 
the basis of average per cent error. On the basis of the proportion of 
errors of 10 per cent or more, the apportionment, the cohort-survival 
(with migration), and the Ratio III (both unadjusted and adjusted) 
results are the best. 

For 1950, the arithmetic (unadjusted), the Ratio III (unadjusted), 





5 The effects of military migration during the past decade were removed because they are believed 
to represent an abnormality which ordinarily could not be taken into account by any of the methods 
of projecting population. Dr. Henry S. Shryock, Jr., has commented: “At first I thought that your 
adjusting the Census data to what we sometimes call the de jure population level was the wrong thing 
to do here. The members of the armed forces stationed in the several states are the result of what may 
be viewed as military migration. Since we are usually interested in forecasting the number that the 
Census will count at a future date, it would seem appropriate to include the armed forces where they 
would be enumerated. Furthermore, some members of the armed forces were stationed in the several 
states in 1930. On the other hand, I realize what you are trying to do is to remove direct effects of the 
defense preparations on the distribution of population among the states. This attempt is consistent with 
the usual practice in national projections to assume that a war will not be going on or be in prospect at 
the future dates for which projections are made. Of course, if the cold war continues long enough, we 
may come to consider the resulting size and distribution of the armed forces as normal, and in this 
case we might hesitate to predict the future distribution of population under peaceful conditions.” 
the cohort-survival (with migration), the apportionment, and the 
Ratio II results are the best on the basis of average per cent error, each 
of these methods having an APE under 13. On the basis of the proportion 
of errors of 10 per cent or more, the Ratio III (unadjusted), the 
cohort-survival (with migration), and the arithmetic (unadjusted) projections 
are the best. 


TABLE 1.—SUMMARY OF PER CENT ERRORS OF PROJECTIONS 
TO 1940 AND 1950 OF THE POPULATIONS OF THE STATES, 
FOR SELECTED METHODS 

Columns (each shown for 1940 and for 1950): average per cent error; 
maximum per cent error; proportion of errors of 10 per cent or more 
(expressed as a per cent); proportion of positive errors (expressed as 
a per cent). 

[Table entries not recoverable; headings and stub only.] 

Method 
Unadjusted to national total 
  Cohort-survival (T&W) 
    With migration 
    No migration 
  Geometric 
  Arithmetic 
  Constant 
  Ratio III (1900 to 1930 modified) 
    T&W national projection* 
    Census count 
Adjusted to T&W national projection* 
  Geometric 
  Arithmetic 
  Apportionment 
  Ratio I (1870 to 1930) 
  Ratio II (1930) 
  Ratio III (1900 to 1930 modified) 
Adjusted to census count 
  Cohort-survival (T&W) 
    With migration 
    No migration 
  Geometric 
  Arithmetic 
  Apportionment 
  Ratio I (1870 to 1930) 
  Ratio II (1930) 
  Ratio III (1900 to 1930 modified) 

* With migration. 

Apparently the cohort-survival (with migration), the apportionment, 
and the Ratio III (unadjusted) results make the consistently best showings. 
The differences between most of the various summary measures, 
however, are too small to justify any definite conclusions except with 
regard to the unfavorable showing of Ratio I. 

It does appear that projections prepared by any of the methods 
mentioned above as being consistently best will be better guides for 
ten and twenty years in the future than the most recently available 
census data or current estimates. Assuming that the state populations 
would be the same in 1950 as in 1930 yields an APE of 19; also, 74 
per cent of the “projections” are in error by 10 per cent or more. These 
values are notably higher than those for the cohort-survival (with 
migration), the apportionment, and the Ratio III (unadjusted) 
methods. 

Control totals—It has generally been assumed that the best current 
estimates⁶ of the populations of the states are obtained by adjusting 
the various estimates for the states to add to comparable estimates 
for the United States. If this is true of current estimates, it would not 
seem unreasonable to expect it to be true of projections. Hagood and 
Siegel made this assumption in preparing their article, “Projections 
of the Regional Distribution of the Population of the United States 
to 1975;”⁷ the method described by them involves adjusting the appropriate 
ratios to sum exactly to 1.00. 

The hypothesis that the use of independent control totals increases 
the accuracy of state projections can be tested by comparing the 
summary measures for the unadjusted projections with those for the 
adjusted projections, both those adjusted to the Thompson-Whelpton 
national totals and those adjusted to the census counts, for each 
method.⁸ (The constant, apportionment, Ratio I, and Ratio II methods 
cannot be included in this comparison, of course.) Although the 
results are somewhat inconclusive, the value of the use of control totals 
appears to be questionable. For 1950, the APE for each of the methods 
for which a comparison can be made, with the exception of the cohort- 
survival method (with migration), is somewhat lower for the unad- 
justed projections than for the adjusted projections. Even the ad- 
justed Ratio III projections, which involve the use of divisional con- 
trol totals, have a slightly higher APE than the unadjusted Ratio III 
projections. 





6 “Estimate” is used here as meaning a figure for a current date which usually does not depend on 
extrapolation to any major extent. 

7 Margaret Jarman Hagood and Jacob S. Siegel, “Projections of the Regional Distribution of the 
Population of the United States to 1975,” Agricultural Economics Research, III (1951), pp. 41-52. 

8 As the base populations are free of error in comparison with the implicit projections of population 
change, it can be argued that the adjustment of the projections should have been in proportion to the 
projected change in the population, or some other factor, rather than the projected populations. It 
may be desirable to include this alternative in future tests of the sort described here. 
It is obvious, of course, that adjusting to the actual census count 
reduces the sum of the differences between the projections and the 
actual population to zero and, further, that if all states had the same 
percentage error, proportionate adjustment to the census count would 
eliminate all errors. However—and this is apparently an important 
however—the value of a proportionate adjustment depends on the 
accuracy of the control total and on the distribution of the errors. If the 
errors are randomly distributed with regard to direction (that is, if the 
number of positive errors is approximately 50 per cent of all errors), 
then adjustments can only introduce a bias toward over-estimating 
or under-estimating. If all of the gross errors are positive or all are 
negative, and the errors in the projections for the remaining states 
are negligible, then proportionate adjustment will tend to decrease, 
but not eliminate, the gross errors at the cost of increasing the errors 
in the projections for the remaining states. Thus, for 1950, adjusting 
introduced (or added to) a bias towards under-estimating in the Ratio 
III, geometric, and arithmetic methods. On the other hand, adjusting 
generally but not consistently yielded better results for 1950 in terms 
of the maximum per cent error and proportion of errors of ten per cent 
or more. 

The projections adjusted to the actual census count are generally 
better than those adjusted to a national projection, as might be expected. 
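
Two of the cases discussed above may be illustrated numerically. The figures below are hypothetical, and the control total is taken to be exactly correct, which is the most favorable case for adjustment.

```python
def pct_errors(projected, actual):
    """Per cent deviations of the projections from the actual figures."""
    return [100.0 * (p - a) / a for p, a in zip(projected, actual)]


def adjust(projected, control_total):
    """Proportionate adjustment of the projections to a control total."""
    total = sum(projected)
    return [p * control_total / total for p in projected]


actual = [100.0, 200.0, 300.0]
control = sum(actual)  # an exactly correct control total

# Case 1: every area over-estimated by the same 10 per cent.
uniform = [110.0, 220.0, 330.0]
uniform_after = pct_errors(adjust(uniform, control), actual)

# Case 2: one gross positive error, the remaining areas exactly right.
one_gross = [150.0, 200.0, 300.0]
gross_before = pct_errors(one_gross, actual)
gross_after = pct_errors(adjust(one_gross, control), actual)

print(uniform_after)              # all errors removed
print(gross_before, gross_after)  # gross error reduced, the other areas now under-estimated
```

In the second case the gross error falls from 50 per cent to about 38 per cent, while each of the two areas that had been projected exactly acquires an error of nearly 8 per cent in the opposite direction.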

Length of projection period.—Inspection of Table 1 shows that the 
projections for 1940, which involve a 10-year projection period, are 
subject to smaller errors than the projections for 1950, which involve 
a 20-year projection period. For all of the methods combined, the APE 
for 1940 is 7, and 22 per cent of projections are in error by 10 per cent 
or more; for 1950, these measures are 15 and 54 per cent, respectively. 
This should be a warning against the claim sometimes made that 
population projections are satisfactory guides for the long-run trend 
even when they deviate in the short-run. 

Size of population and migration rate.—It seems worth while to investigate 
the relation of the accuracy of the projections to the size 
of the base population and to the size of the migration rate charac- 
teristic of the area prior to the projection period. Correlation coeffi- 
cients and regression equations would yield the most useful measures 
of the relation between these items. However, the measures shown in 
Table 2 and in Table 3 are indicative of the relations. Table 2 shows 
summary measures of the errors in the 1950 projections for the 24 
states with populations of 1.88 million or more in 1930 and the 25 
states with populations of 1.85 million or less in 1930; Table 3 shows 
similar measures for the 24 states with 1920-30 average migration 
rates of 0.5 or less per year (regardless of the direction of the migration) 
and the 25 states with 1920-30 average migration rates of 0.6 or more 
per year.⁹ 


TABLE 2.—SUMMARY OF PER CENT ERRORS OF PROJECTIONS 
TO 1950 OF THE POPULATIONS OF THE STATES, FOR SELECTED 
METHODS, BY SIZE OF POPULATION IN 1930 

                                                                          Proportion of
                                                                          errors of 10 per  Proportion of
                                                                          cent or more      positive errors
                                      Average per cent  Maximum per cent  (expressed as a   (expressed as a
Method                                error             error             per cent)         per cent)
                                           24       25       24       25       24       25       24       25
                                      largest smallest  largest smallest  largest smallest  largest smallest
                                       states   states   states   states   states   states   states   states
Unadjusted to national total
  Cohort-survival (T&W)
    With migration                       8.43    16.45    27.54    39.52     33.3     68.0     12.5     16.0
    No migration                        10.88    19.17    45.32    43.45     45.8     64.0     37.5     32.0
  Geometric                             14.44    12.23    47.40    24.70     50.0     56.0     70.8     36.0
  Arithmetic                             8.74    12.99    38.38    25.86     33.3     68.0     66.7     32.0
  Constant                              16.12    22.29    46.02    46.89     70.8     76.0      4.2     12.0
  Ratio III (1900 to 1930 modified)
    T&W national projection*             8.62    16.15    24.50    34.67     29.2     68.0     16.7     16.0
    Census count                         7.18    12.38    34.45    28.68     16.7     56.0     62.5     28.0
Adjusted to T&W national projection*
  Geometric                             14.26    19.96    33.51    37.10     75.0     76.0     20.8      4.0
  Arithmetic                             9.73    18.16    28.13    34.23     37.5     76.0     16.7     12.0
  Apportionment                          8.37    16.88    26.39    34.75     33.3     72.0     16.7     16.0
  Ratio I (1870 to 1930)                19.36    43.64   105.50   276.38     79.2     88.0     12.5     40.0
  Ratio II (1930)                        8.73    16.71    39.13    40.11     37.5     68.0     33.3     24.0
  Ratio III (1900 to 1930 modified)      9.79    17.93    26.41    36.61     37.5     72.0      8.3     12.0
Adjusted to census count
  Cohort-survival (T&W)
    With migration                       7.55    13.41    35.69    33.98     20.8     56.0     66.7     32.0
    No migration                        12.33    19.07    51.80    55.35     37.5     64.0     66.7     52.0
  Geometric                             11.62    15.17    31.23    31.34     45.8     68.0     33.3     24.0
  Arithmetic                             7.71    14.08    33.98    28.21     16.7     72.0     54.2     24.0
  Apportionment                          7.26    13.75    33.61    28.36     20.8     60.0     58.3     24.0
  Ratio I (1870 to 1930)                15.12    45.99   124.31   310.83     50.0     84.0     20.8     48.0
  Ratio II (1930)                        9.45    15.62    33.56    34.63     33.3     56.0     70.8     44.0
  Ratio III (1900 to 1930 modified)      6.72    13.71    32.54    30.81     16.7     64.0     50.0     24.0

* With migration. 
9 For this purpose, the absolute values of the rates were used. They were obtained from Henry S. 
Shryock, Jr., “Internal Migration and the War” (Journal of the American Statistical Association, 38 
(1943), pp. 16-30). 
TABLE 3.—SUMMARY OF PER CENT ERRORS OF PROJECTIONS 
TO 1950 OF THE POPULATIONS OF THE STATES, FOR 
SELECTED METHODS, BY AVERAGE MIGRATION 
RATE FOR 1920-1930 

                                                                          Proportion of
                                                                          errors of 10 per  Proportion of
                                                                          cent or more      positive errors
                                      Average per cent  Maximum per cent  (expressed as a   (expressed as a
Method                                error             error             per cent)         per cent)
                                           24       25       24       25       24       25       24       25
                                       states   states   states   states   states   states   states   states
                                         with     with     with     with     with     with     with     with
                                     smallest  largest smallest  largest smallest  largest smallest  largest
                                        rates    rates    rates    rates    rates    rates    rates    rates
Unadjusted to national total
  Cohort-survival (T&W)
    With migration                      11.07    13.92    39.52    32.87     41.7     60.0     12.5     16.0
    No migration                        12.79    17.34    40.17    45.32     50.0     60.0     20.8     48.0
  Geometric                             11.61    14.95    47.40    44.56     45.8     60.0     62.5     44.0
  Arithmetic                             9.42    12.33    38.38    25.01     37.5     64.0     62.5     36.0
  Constant                              19.24    19.30    42.64    46.89     79.2     68.0      4.2     12.0
  Ratio III (1900 to 1930 modified)
    T&W national projection*            10.99    13.88    34.67    31.89     41.7     56.0     16.7     16.0
    Census count                         8.43    11.18    34.45    25.66     20.8     52.0     54.2     36.0
Adjusted to T&W national projection*
  Geometric                             15.15    19.10    35.76    37.10     75.0     76.0      8.3     16.0
  Arithmetic                            11.76    16.21    34.23    33.48     50.0     64.0     12.5     16.0
  Apportionment                         11.00    14.36    34.75    31.57     50.0     56.0      8.3     24.0
  Ratio I (1870 to 1930)                24.91    38.32   105.50   276.38     79.2     88.0     20.8     32.0
  Ratio II (1930)                       10.85    14.68    35.32    40.11     50.0     56.0     20.8     36.0
  Ratio III (1900 to 1930 modified)     12.56    15.26    36.61    33.61     50.0     60.0      4.2     16.0
Adjusted to census count
  Cohort-survival (T&W)
    With migration                       8.93    12.09    35.69    26.72     29.2     48.0     54.2     44.0
    No migration                        11.54    19.83    51.80    55.85     33.3     68.0     54.2     64.0
  Geometric                             10.76    16.00    31.23    31.34     41.7     72.0     29.2     28.0
  Arithmetic                             8.90    12.94    33.98    27.40     25.0     64.0     45.8     32.0
  Apportionment                          8.68    12.39    33.61    26.86     20.8     60.0     50.0     32.0
  Ratio I (1870 to 1930)                22.58    38.82   124.31   310.83     58.3     76.0     33.3     36.0
  Ratio II (1930)                        9.66    15.41    31.07    34.63     37.5     52.0     54.2     60.0
  Ratio III (1900 to 1930 modified)      8.63    11.88    32.54    27.54     25.0     56.0     41.7     32.0

* With migration. 

Table 2 suggests definitely that, for a given method and a given 
length of projection period, the errors of projections tend to be larger 
as the size of the population on which the projections are based becomes 
smaller. Thus, for all methods shown in Table 2, the APE of the 
projections for 1950 for the 24 states with the larger populations is 11, 
while the APE for the other 25 states is 19. Also, 39 per cent of the 
projections of the larger populations are in error by 10 per cent or 
more, while 68 per cent of the projections of the smaller populations 
are in error by 10 per cent or more. 
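
The combined figures quoted from Table 2 can be approximated by averaging the entries over the 21 method rows. The sketch below assumes that the combined values are simple unweighted means over all rows, the "Constant" row included; the text does not state how they were formed. The entries are transcribed from Table 2, and the variable names are hypothetical.

```python
# Each tuple: (APE, 24 largest; APE, 25 smallest;
#              per cent of errors >= 10, 24 largest; same, 25 smallest)
table2_rows = [
    (8.43, 16.45, 33.3, 68.0), (10.88, 19.17, 45.8, 64.0),
    (14.44, 12.23, 50.0, 56.0), (8.74, 12.99, 33.3, 68.0),
    (16.12, 22.29, 70.8, 76.0), (8.62, 16.15, 29.2, 68.0),
    (7.18, 12.38, 16.7, 56.0),                                # unadjusted
    (14.26, 19.96, 75.0, 76.0), (9.73, 18.16, 37.5, 76.0),
    (8.37, 16.88, 33.3, 72.0), (19.36, 43.64, 79.2, 88.0),
    (8.73, 16.71, 37.5, 68.0), (9.79, 17.93, 37.5, 72.0),     # adjusted to T&W
    (7.55, 13.41, 20.8, 56.0), (12.33, 19.07, 37.5, 64.0),
    (11.62, 15.17, 45.8, 68.0), (7.71, 14.08, 16.7, 72.0),
    (7.26, 13.75, 20.8, 60.0), (15.12, 45.99, 50.0, 84.0),
    (9.45, 15.62, 33.3, 56.0), (6.72, 13.71, 16.7, 64.0),     # adjusted to census
]

# Unweighted mean of each column over the 21 rows, rounded to whole per cents.
means = [sum(column) / len(column) for column in zip(*table2_rows)]
rounded = [round(m) for m in means]
print(rounded)
```

Under that assumption the rounded means agree with the figures of 11, 19, 39, and 68 cited in the text.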

The errors also tend to be larger for the states with the larger average 
migration rates, according to Table 3. The APE for the 24 states with 
the smaller migration rates is 12, while the APE for the other states 
is 17. Errors of 10 per cent or more occur in 45 per cent of the projec- 
tions for the first 24 states and in 63 per cent of the projections for the 
second 25 states. 

As might be expected, the projections for the 14 states which are 
both among the 24 states with larger populations and among the 24 
states with the smaller migration rates, have a smaller average per cent 
error (10) and a smaller proportion (33 per cent) of errors of 10 per 
cent or more than either of the two complete groups. 

Need for additional research—Even though the results of this test 
are inconclusive, they are probably of sufficient value and interest to 
warrant additional research along the same general lines. Other areas 
needing additional research are numerous. A few of these areas are men- 
tioned below. Failure to include the logistic method is one of the more 
obvious gaps of this study. The question of using controls at any level 
and the question of a single national control versus controls at the 
national and various intermediate levels, should be explored further. 
In the light of the results for the several ratio methods, the best balance 
in the basic assumptions between emphasis on long-run trend and 
emphasis on recent experience, should be investigated. In connection 
with this, some measures relating to the possibility of allowing spe- 
cifically for various economic conditions in the future by the use of 
“representative” trends from appropriate past periods should be ob- 
tained. Because any period represents unique experience, projections 
for additional combinations of periods should be studied. In addition, 
the significance of the various measures should be tested. 

Conclusions.—Although not one of the methods tested is clearly 
superior to the others, the cohort-survival (with migration), the ap- 
portionment, and the Ratio III (unadjusted) results make the con- 
sistently best showings on the basis of average per cent error and 
proportion of errors of 10 per cent or more. 

The following hypotheses, while not proven by this test, are con- 
sistent with the results obtained: 

1. Projections obtained by these three methods will be better guides 
for ten and twenty years in the future than the most recent data on 
current population size. 

2. The value of the use of independent control totals is questionable. 

3. The errors of projections tend to increase almost directly as the 
length of the projection period increases. 

4. The errors of projections tend to be larger for areas with smaller 
base populations. 

5. The errors of projections tend to be larger for areas with larger 
net migration rates in the recent past.¹⁰ 
Discussions of the accuracy of population projections will be found 
in the following items: 


Davis, J. S., The Population Upsurge in the United States, War-Peace Pamphlets 
No. 12, Stanford: Food Research Institute, Stanford University, 1949. 
Dorn, Harold F., “Pitfalls in Population Forecasts and Projections,” Journal of 
tion Studies, III (1950), 406-12. 

Wilson, Edwin B., and Ruth R. Puffer, “Least Squares and Laws of Population 
Growth,” Proceedings of the American Academy of Arts and Sciences, 68 
(1933), 285-382. 


Descriptions of various methods of projecting population, references 
to pertinent literature, comments on the development of projections, 
and a priori evaluations of various methods, will be found in the follow- 
ing items: 





























Hagood, Margaret Jarman, and Jacob S. Siegel, “Projections of the Regional 








10 An article by Robert C. Schmitt and Albert H. Crosetti entitled “Accuracy of the Ratio Method 
for Forecasting City Population” (Land Economics, XXVII (1951), pp. 346-48), has just come to the 
attention of the author. This article describes a test of the accuracy of the ratio method in predicting 
the population of selected large cities and of variations in accuracy with length of projection period, 
size of population, and growth rate. The findings of Schmitt and Crosetti are in agreement with (4) 
above but not with (5) above. It is possible that the results would agree if coefficients of partial correlation 
had been used to measure the association of accuracy of projections, size of base population, and 
growth rates or migration rates. 













492 AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEMBER 1954 


Distribution of the Population of the United States to 1975,” Agricultural 
Economics Research, III (1951), 41-52. 

Notestein, Frank W., et al., The Future Population of Europe and the Soviet Union, 
Geneva: League of Nations, 1944, 199-234. 

Reed, Lowell J., “Population Growth and Forecasts,” The Annals of the American 
Academy of Political and Social Science, 188 (1936), 159-66. 
the Americas,” Estadistica, 7 (1944), 323-46. 
APPENDIX TABLE A.—PROJECTIONS TO 1940 OF THE 1930 
POPULATIONS OF THE STATES, BY SELECTED METHODS 
(In thousands. Each figure has been independently rounded) 

                                  Unadjusted to national total
                                                                                           Ratio III (1900 to 1930
                                 Cohort-survival (T&W)                                                   modified)
                    Enumerated        With          No                                    T&W national      Census
State               population   migration   migration   Geometric  Arithmetic    Constant projection*       count
United States          131,669     131,865     132,098     143,401     139,423     122,775     133,011     132,813
Alabama                  2,833       2,801       3,024       2,973       2,937       2,646       2,767       2,763
Arizona                    499         513         491         564         535         436         527         527
Arkansas                 1,949       1,898       2,113       1,960       1,954       1,854       1,881       1,878
California               6,907       6,808       5,810       9,290       7,873       5,677       7,915       7,903
Colorado                 1,123       1,082       1,104       1,139       1,130       1,036       1,084       1,082
Connecticut              1,709       1,726       1,682       1,863       1,828       1,607       1,753       1,751
Delaware                   267         248         249         254         253         238         242         241
District of Col.           663         513         488         540         535         487         512         512
Florida                  1,897       1,716       1,557       2,203       1,956       1,468       1,859       1,856
Georgia                  3,124       2,915       3,308       2,921       2,921       2,909       2,877       2,873
Idaho                      525         455         504         458         458         445         453         453
Illinois                 7,897       8,177       7,933       8,943       8,748       7,631       8,291       8,278
Indiana                  3,428       3,405       3,398       3,570       3,539       3,239       3,328       3,323
Iowa                     2,538       2,497       2,649       2,538       2,536       2,471       2,441       2,438
Kansas                   1,801       1,940       2,035       1,997       1,990       1,881       1,887       1,884
Kentucky                 2,846       2,732       2,955       2,823       2,808       2,615       2,665       2,661
Louisiana                2,364       2,265       2,357       2,446       2,397       2,102       2,261       2,258
Maine                      847         823         852         827         826         797         786         785
Maryland                 1,821       1,730       1,710       1,831       1,809       1,632       1,724       1,721
Massachusetts            4,317       4,458       4,398       4,677       4,637       4,250       4,423       4,416
Michigan                 5,256       5,550       5,230       6,349       5,988       4,842       5,653       5,645
Minnesota                2,792       2,647       2,767       2,749       2,736       2,564       2,598       2,595
Mississippi              2,184       2,137       2,293       2,033       2,224       2,010       2,085       2,082
Missouri                 3,785       3,709       3,815       3,864       3,849       3,629       3,645       3,639
Montana                    559         528         588         527         527         538         535         534
Nebraska                 1,316       1,415       1,501       1,463       1,458       1,378       1,384       1,382
Nevada                     110          92          95         107         104          91          98          98
New Hampshire              492         476         480         488         487         465         463         463
New Jersey               4,160       4,488       4,218       5,144       4,905       4,041       4,697       4,690
New Mexico                 532         466         496         495         485         423         457         456
New York                13,479      13,709      12,989      15,187      14,737      12,588      13,798      13,778
North Carolina           3,572       3,575       3,682       3,907       3,767       3,170       3,521       3,515
North Dakota               642         702         790         716         714         681         680         679
Ohio                     6,908       7,107       6,956       7,644       7,512       6,647       7,163       7,153
Oklahoma                 2,336       2,620       2,785       2,819       2,755       2,396       2,611       2,607
Oregon                   1,090       1,028         978       1,156       1,120         954       1,081       1,079
Pennsylvania             9,900      10,086      10,203      10,612      10,520       9,631      10,099      10,084
Rhode Island               713         732         717         780         769         687         725         724
South Carolina           1,900       1,781       2,000       1,794       1,792       1,739       1,758       1,755
South Dakota               643         728         783         753         748         693         706         705
Tennessee                2,916       2,754       2,937       2,920       2,888       2,617       2,719       2,715
Texas                    6,415       6,469       6,506       7,236       6,958       5,825       6,587       6,577
Utah                       550         559         584         572         565         508         539         538
Vermont                    359         365         379         367         367         360         340         340
Virginia                 2,678       2,495       2,682       2,537       2,532       2,422       2,454       2,450
Washington               1,736       1,658       1,620       1,795       1,765       1,563       1,693       1,691
West Virginia            1,902       1,913       1,979       2,035       1,988       1,729       1,908       1,905
Wisconsin                3,138       3,123       3,195       3,273       3,238       2,939       3,093       3,088
Wyoming                    251         247         251         261         256         226         245         244

* With migration. 
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APPENDIX TABLE A.—Continued 
(In thousands. Each figure has been independently rounded) 


























Adjusted to T&W national projection* 
State - Ratio I P Ratio III 
Geometric | Arithmetic | APPortion- (1870 to Ratio II (1900 to 1930 
United States 131,865 131,865 131,865 131 , 865 131,865 131,865 
Alabama 2,734 2,778 2,805 2,664 2,842 2,751
Arizona 519 506 490 673 468 520
Arkansas 1,802 1,848 1,909 1,965 1,992 1,869
California 8,543 7,446 6,875 6,989 6,098 7,699
Colorado 1,047 1,068 1,087 1,477 1,112 1,068
Connecticut 1,713 1,729 1,727 1,648 1,726 1,749
Delaware 234 240 247 224 256 239 
District of Col. 497 506 513 514 523 507 
Florida 2,026 1,850 1,734 1,741 1,577 1,839 
Georgia 2,686 2,763 2,915 2,888 3,124 2,845
Idaho 421 433 452 646 478 447 
Illinois 8,223 8,274 8,240 7,820 8,196 8,256 
Indiana 3,283 3,347 3,403 3,099 3,478 3,314 
Iowa 2,334 2,399 2,507 2,400 2,654 2,428 
Kansas 1,836 1,882 1,940 2,083 2,020 1,877 
Kentucky 2,596 2,656 2,720 2,519 2,808 2,650 
Louisiana 2,250 2,267 2,263 2,136 2,257 2,247 
Maine 761 781 813 712 856 784 
Maryland 1,684 1,711 1,728 1,582 1,752 1,705 
Massachusetts 4,300 4,386 4,461 4,338 4,564 4,411 
Michigan 5,838 5,663 5,467 5,195 5,201 5,629 
Minnesota 2,528 2,588 2,658 2,901 2,754 2,585 
Mississippi 1,870 2,103 2,127 1,991 2,159 2,073 
Missouri 3,553 3,641 3,749 3,521 3,898 3,625 
Montana 484 498 538 76. 577 528 
Nebraska 1,345 1,379 1,421 1,727 1,480 1,376 
Nevada 98 99 98 22 98 97 
New Hampshire 449 461 477 422 500 462 
New Jersey 4,730 4,639 4,513 4,391 4,341 4,654 
New Mexico 458 457 462 455 450 
New York 13,965 13,938 13,761 12,804 13,520 13,674
North Carolina 3,593 3,562 3,496 3,244 3,405 3,482 
North Dakota 658 675 699 1,398 731 676 
Ohio 7,029 7,105 7,119 6,620 7,139 7,133
Oklahoma 2,592 2,605 2,592 3,521 2,573 2,594 
Oregon 1,063 1,059 1,044 1,174 1,024 1,051 
Pennsylvania 9,758 9,950 10,116 9,732 10,344 10,008 
Rhode Island 717 727 732 712 738 723 
South Carolina 1,650 1,695 1,768 1,727 1,867 1,738 
South Dakota 692 707 723 1,108 744 702 
Tennessee 2,685 2,732 2,765 2,532 2,810 2,703 
Texas 6,654 6,581 6,443 6,791 6,256 6,545 
Utah 526 534 539 580 545 531 
Vermont 337 347 363 316 386 339 
Virginia 2,333 2,395 2,482 2,334 2,601 2,427 
Washington 1,651 1,669 1,673 2,545 1,679 1,647 
West Virginia 1,871 1,880 1,871 1,846 1,857 1,887 
Wisconsin 3,010 3,063 3,102 2,980 3,157 3,080 
Wyoming 240 242 242 316 242 241 

















* With migration.

APPENDIX TABLE A.—Continued
(In thousands. Each figure has been independently rounded) 









































Adjusted to census count

State          Cohort-survival                Geometric  Arithmetic  Apportionment  Ratio I         Ratio II  Ratio III
               With migration  No migration                                         (1870 to 1930)  (1930)    (1900 to 1930 modified)

United States 131,669 131,669 131,669 131,669 131,669 131,669 131,669 131,669
Alabama 2,797 3,014 2,730 2,774 2,802 2,660 2,838 2,747
Arizona 512 489 518 505 488 672 467 519 
Arkansas 1,895 2,106 1,800 1,846 1,908 1,962 1,989 1,866 
California 6,798 5,790 8,530 7,435 6,849 6,978 6,089 7,687 
Colorado 1,080 1,100 1,046 1,067 1,086 1,475 1,111 1,067 
Connecticut 1,723 1,676 1,711 1,726 1,725 1,646 1,723 1,746 
Delaware 248 248 234 239 246 224 256 239 
District of Col. 512 486 496 505 513 514 522 506 
Florida 1,714 1,552 2,023 1,847 1,729 1,738 1,575 1,836 
Georgia 2,911 3,297 2,682 2,758 2,915 2,884 3,119 2,841
Idaho 454 502 421 432 452 645 477 446 
Illinois 8,165 7,906 8,211 8,262 8,227 7,808 8,183 8,243 
Indiana 3,400 3,387 3,278 3,342 3,399 3,094 3,473 3,309 
Iowa 2,493 2,640 2,330 2,395 2,506 2,396 2,650 2,425 
Kansas 1,937 2,028 1,833 1,879 1,939 2,080 2,017 1,874 
Kentucky 2,728 2,945 2,592 2,652 2,718 2,515 2,804 2,646 
Louisiana 2,262 2,349 2,246 2,264 2,259 2,133 2,254 2,244 
Maine 822 849 760 780 813 711 855 783 
Maryland 1,727 1,704 1,681 1,708 1,726 1,580 1,750 1,702 
Massachusetts 4,452 4,383 4,294 4,379 4,457 4,332 4,557 4,404 
Michigan 5,542 5,212 5,829 5,655 5,454 5,188 5,193 5,621 
Minnesota 2,643 2,758 2,524 2,584 2,656 2,897 2,750 2,581
Mississippi 2,134 2,285 1,867 2,100 2,124 1,988 2,155 2,070 
Missouri 3,704 3,802 3,547 3,635 3,747 3,516 3,892 3,620 
Montana 527 586 484 497 538 764 577 527 
Nebraska 1,413 1,496 1,343 1,377 1,420 1,725 1,478 1,374 
Nevada 92 95 98 99 98 92 98 97 
New Hampshire 475 478 448 460 477 421 499 462 
New Jersey 4,481 4,204 4,723 4,632 4,503 4,385 4,334 4,648 
New Mexico 465 494 455 458 456 461 454 450 
New York 13,689 12,945 13,944 13,918 13,735 12,785 13,500 13,653
North Carolina 3,750 3,670 3,587 3,557 3,489 3,239 3,400 3,477 
North Dakota 701 787 657 674 699 1,396 730 675 
Ohio 7,097 6,932 7,019 7,095 7,109 6,610 7,128 7,123 
Oklahoma 2,616 2,776 2,588 2,602 2,588 3,516 2,570 2,590
Oregon 1,027 975 1,061 1,058 1,043 1,172 1,023 1,050
Pennsylvania 10,071 10,169 9,744 9,935 10,106 9,717 10,329 9,993
Rhode Island 731 715 716 726 731 711 737 722 
South Carolina 1,778 1,993 1,647 1,693 1,767 1,725 1,865 1,736 
South Dakota 727 780 691 706 722 1,106 743 701 
Tennessee 2,750 2,972 2,681 2,728 2,762 2,528 2,806 2,699 
Texas 6,460 6,484 6,644 6,571 6,430 6,781 6,247 6,535
Utah 558 582 525 533 538 579 545 530
Vermont 364 378 337 346 363 316 386 339 
Virginia 2,491 2,673 2,330 2,391 2,481 2,331 2,597 2,423 
Washington 1,656 1,615 1,649 1,667 1,671 2,541 1,677 1,644 
West Virginia 1,910 1,972 1,868 1,878 1,867 1,843 1,854 1,884
Wisconsin 3,118 3,184 3,005 3,058 3,099 2,976 3,152 3,075 
Wyoming 247 250 239 242 242 316 242 241

APPENDIX TABLE A.—Continued
(In thousands. Each figure has been independently rounded)

Unadjusted to national total

Cohort-survival
Ratio III (1900 to 1930 modified)
T&W national projection*

United States 151,116
Alabama
Arizona
Arkansas
California
Delaware

Adjusted to T&W national projection*

State          Geometric  Arithmetic  Apportionment  Ratio I         Ratio II  Ratio III
                                                     (1870 to 1930)  (1930)    (1900 to 1930 modified)

United States 138,442 138,442 138,442 138,442 138,442 138,442
Alabama 2,725 2,863 2,920 2,617 2,984 2,809 
Arizona 596 562 529 914 491 590 
Arkansas 1,690 1,822 1,948 2,021 2,091 1,862 
California 12,400 8,931 7,742 8,030 6,402 9,604 
Colorado 1,022 1,085 1,124 1,883 1,168 1,082 
Connecticut 1,762 1,817 1,815 1,647 1,812 1,850 
Delaware 221 238 252 222 269 237 
District of Col. 489 517 532 526 549 516 
Florida 2,697 2,167 1,927 1,938 1,656 2,164 
Georgia 2,393 2,602 2,920 2,824 3,280 2,761 
Idaho 385 418 457 831 502 443 
Illinois 8,549 8,751 8,682 7,822 8,604 8,689
Indiana 3,211 3,406 3,521 2,949 3,652 3,339 
Iowa 2,126 2,308 2,532 2,298 2,786 2,374 
Kansas 1,729 1,862 1,984 2,215 2,121 1,856 
Kentucky 2,487 2,662 2,796 2,395 2,948 2,653 
Louisiana 2,323 2,389 2,380 2,132 2,370 2,344
Maine 700 758 824 609 899 756 
Maryland 1,678 1,762 1,798 1,523 1,840 1,741 
Massachusetts 4,198 4,457 4,614 4,319 4,792 4,496 
Michigan 6,789 6,327 5,919 5,385 5,460 6,267 
Minnesota 2,404 2,580 2,726 3,129 2,891 2,577 
Mississippi 1,678 2,162 2,211 1,952 2,266 2,104 
Missouri 3,355 3,609 3,836 3,392 4,092 3,589 
Montana 457 457 538 983 606 514 
Nebraska 1,266 1,364 1,453 1,994 1,554 1,362 
Nevada 102 104 104 97 103 101 
New Hampshire 418 451 486 388 525 455 
New Jersey 5,341 5,117 4,854 4,596 4,557 5,145 
New Mexico 473 484 481 485 477 467 
New York 14,944 14,979 14,609 12,751 14,194 14,440 
North Carolina 3,927 3,870 3,731 3,240 3,575 3,698
North Dakota 614 663 712 2,354 768 666 
Ohio 7,170 7,432 7,461 6,479 7,495 7,460 
Oklahoma 2,705 2,762 2,733 4,624 2,702 2,732 
Oregon 1,142 1,141 1,110 1,357 1,075 1,116 
Pennsylvania 9,537 10,121 10 ,467 9 ,636 10,860 10,209 
Rhode Island 721 754 764 720 775 745 
South Carolina 1,510 1,638 1,789 1,689 1,961 1,716 
South Dakota 667 712 745 1,551 781 703 
Tennessee 2,659 2,803 2,872 2,437 2,950 2,747 
Texas 7,332 7,177 6,890 7,490 6,568 7,096 
Utah 526 552 561 623 573 544 
Vermont 305 331 366 277 406 321 
Virginia 2,168 2,343 2,525 2,229 2,731 2,399 
Washington 1,682 1,745 1,753 3,627 1,763 1,687 
West Virginia 1,953 1,993 1,973 1,897 1,950 1,993 
Wisconsin 2,973 3,138 3,221 2,949 3,314 3,161 
Wyoming 246 254 254 401 254 251 





* With migration.

Adjusted to census count

State          Cohort-survival                Geometric  Arithmetic  Apportionment  Ratio I         Ratio II  Ratio III
               With migration  No migration                                         (1870 to 1930)  (1930)    (1900 to 1930 modified)

United States 151,116 151,116 151,116 151,116 151,116 151,116 151,116 151,116
Alabama 3,188 3,696 2,975 3,125 3,141 2,856 3,257 3,066
Arizona 633 586 650 613 604 997 536 644 
Arkansas 2,101 2,562 1,844 1,989 2,024 2,206 2,283 2,032 
California 8,319 6,228 13,536 9,749 9,412 8,765 6,988 10,483
Colorado 1,213 1,256 1,115 1,185 1,195 2,055 1,275 1,181 
Connecticut 1,980 1,867 1,924 1,983 1,982 1,798 1,978 2,019 
Delaware 277 276 242 260 264 242 293 258 
District of Col. 570 509 534 565 569 574 599 563 
Florida 2,078 1,754 2,944 2,366 2,298 2,116 1,807 2,362 
Georgia 3,180 4,017 2,612 2,840 2,930 3,083 3,580 3,014 
Idaho 508 609 420 456 467 907 548 483 
Illinois 9,288 8,680 9,331 9,552 9,532 8,538 9,392 9,484 
Indiana 3,844 3,800 3,504 3,718 3,750 3,219 3,986 3,645 
Iowa 2,731 3,031 2,321 2,519 2,582 2,509 3,041 2,591 
Kansas 2,154 2,338 1,887 2,032 2,066 2,418 2,315 2,026
Kentucky 3,112 3,589 2,714 2,906 2,943 2,614 3,218 2,896 
Louisiana 2,615 2,815 2,536 2,607 2,605 2,327 2,587 2,558 
Maine 918 978 764 828 846 665 981 836 
Maryland 1,965 1,905 1,829 1,923 1,933 1,662 2,008 1,901 
Massachusetts 5,018 4,824 4,582 4,865 4,909 4,715 5,231 4,907 
Michigan 6,702 5,968 7,411 6,906 6,791 5,878 5,960 6,840 
Minnesota 2,940 3,174 2,624 2,817 2,857 3,415 3,156 2,813 
Mississippi 2,444 2,783 1,831 2,360 2,374 2,131 2,474 2,297 
Missouri 4,072 4,256 3,662 3,940 4,003 3,702 4,467 3,918 
Montana 564 679 498 499 538 1,073 662 561 
Nebraska 1,565 1,735 1,382 1,488 1,513 2,176 1,696 1,487 
Nevada 105 105 111 114 114 106 112 110 
New Hampshire 529 532 456 492 502 423 573 497 
New Jersey 5,254 4,640 5,830 5,586 5,511 5,017 4,974 5,616 
New Mexico 557 615 516 529 528 529 521 510 
New York 15,716 14,071 16,313 16,350 16,244 13,918 15,494 15,762
North Carolina 4,314 4,598 4,287 4,224 4,185 3,536 3,902 4,037 
North Dakota 782 971 670 723 737 2,569 838 727 
Ohio 8,114 7,718 7,827 8,112 8,119 7,072 8,181 8,143 
Oklahoma 3,053 3,416 2,953 3,015 3,006 5,047 2,949 2,982 
Oregon 1,171 1,061 1,247 1,245 1,237 1,481 1,174 1,218 
Pennsylvania 11,326 11,534 10,411 11,047 11,144 10,518 11,855 11,143
Rhode Island 835 794 787 823 825 786 846 813 
South Carolina 1,989 2,481 1,638 1,788 1,830 1,844 2,140 1,873 
South Dakota 826 940 728 777 786 1,692 853 768 
Tennessee 3,114 3,513 2,902 3,060 3,079 2,660 3,221 2,999 
Texas 7,613 7,690 8,004 7,834 7,752 8,175 7,169 7,746 
Utah 666 717 574 602 605 680 625 593 
Vermont 402 430 333 362 372 302 443 350 
Virginia 2,786 3,197 2,366 2,558 2,609 2,433 2,981 2,619 
Washington 1,869 1,767 1,836 1,904 1,907 3,959 1,924 1,842 
West Virginia 2,279 2,424 2,131 2,176 2,170 2,070 2,128 2,176
Wisconsin 3,551 3,691 3,245 3,426 3,448 3,219 3,617 3,451
Wyoming 287 293 268 277 277 438 278 274


FACTORS IN INTERPRETING MORTALITY
AFTER RETIREMENT


Robert J. Myers
Social Security Administration


Currently there is considerable discussion as to the effect of 
compulsory retirement on the national economy and on the 
vitality and longevity of the individuals concerned. Some 
experiences would seem to indicate that retirement causes 
higher mortality than is standard for the ages concerned. Fre- 
quently, however, such conclusions are not warranted because 
the individuals who do retire under voluntary provisions tend 
to be those who are in poor health. When retirement is com- 
pulsory, such experience as is available does not indicate high 
mortality but this is probably, at least in part, due to the fact 
that many quite healthy workers are among the retired group 
in contrast with the situation under plans having voluntary 
retirement. There are no conclusive data currently on hand to
indicate, for a given group of individuals, what the effect of
retirement on mortality really is, that is, how mortality would
differ according to whether similar individuals retired or
continued working.


vantages of continuing individuals in employment beyond age 65 
rather than having compulsory retirement at that age, as is the case 
in many retirement plans. Such advantages are said to accrue both to 
the individual involved and to the nation. 

One of the advantages frequently claimed insofar as the individual 
is concerned is that a person compelled to retire loses his vitality and 
thus tends to die much earlier than if allowed to continue in gainful 
employment. This runs contrary to the viewpoint frequently ex- 
pressed many years ago that workers were being compelled to remain 
at work because there was no pension plan to take care of them so 
that their inevitable end was death from exhaustion. Instead, it was 
advocated that such workers should be allowed to spend their declining 
years in peace and leisure while supported by a pension. 

Currently, there are about 15,000 pension plans supplementary to 
the social security program. Many of these, following to some extent 
previous employer practice, provide for a compulsory retirement age 
(often at 65). In the majority of plans, retirement may be deferred with 
the consent of the employer. That retirement at age 65 is by no means 
universal is indicated by the fact that the average retirement age 
under the Old-Age and Survivors Insurance program is currently 69
for men and somewhat over 68 for women (in 1940-50 it was generally
about one year higher). 

This paper will examine the question as to the effect of retirement on 
mortality. Before proceeding further, let me issue the warning that no 
clear and definite conclusions will be or can be drawn because there are 
so many conflicting factors involved. 

Unfortunately, specific and reliable data on this subject are not avail- 
able. The analysis is complicated by the question as to whether people 
retire because they are disabled and are thus subject to high mortality, 
or on the other hand whether the high mortality is the result of retire- 
ment. Data as to the mortality of retired persons will be examined for 
several governmental retirement systems and for a few non-govern- 
mental pension plans in an effort to throw some light on this matter. 


EXPERIENCE TO BE EXPECTED IN VARIOUS TYPES OF PLANS 


Before proceeding to such actual data as are available, it will be 
worthwhile to examine briefly the effect that the particular provisions 
of the plan might have on the resulting experience. This is an ex- 
tremely important factor because completely different results may 
be obtained for what is essentially the same underlying mortality—all 
depending on the structure of the benefits provided and the admin- 
istrative procedure adopted. 

In considering various possible hypothetical pension plans, let it first 
be assumed that mortality is not affected by retirement. Then we shall 
be able to see that any indications of lower or higher mortality following 
retirement arise solely from the particular plan and its provisions. 

First, consider a plan which has no benefits payable before age 65— 
either early age retirements or disability retirements—but which has 
compulsory retirement at age 65 and which pays an annuity beginning 
at age 65 to those who previously left service because of disability. 
Under this plan, mortality after age 65 would, for the entire retired 
group, be fairly comparable with that previous to age 65, or with what 
might be termed the “general level.” Of course, as between those who
were in active service when they attained age 65 and those disabled 
persons previously separated from service who receive an annuity at 
age 65, the former would experience lower mortality. 

Second, consider what the situation would be if the previous plan 
did not have compulsory retirement at age 65. For ages shortly after 
age 65, it is likely that the mortality experience would be higher than 
the general level because there would be a tendency for the less healthy 
lives to retire at or shortly after age 65 and for the healthier lives to
continue at work. After age 70, the mortality experience of the total
retired would approach the general level of mortality because virtually 
everybody would have retired by then. 

Third, consider the case where disability pensions are provided (or 
where disabled persons receive no vested rights for a pension at age 
65). If retirement is compulsory at age 65, the experience for non- 
disabled retired workers will show definitely lower mortality than the 
general level at the ages shortly after age 65 but eventually will merge 
into the general level. If retirement is not compulsory at age 65, the 
resulting mortality experience will probably be somewhat higher than 
the general level at the ages just beyond age 65 and not as high as for 
the group of disabled pensioners. 

Fourth, consider the experience under a plan which permits optional 
retirement before age 65. There is a subdivision between disability 
pensioners and others (as there well might be because of a differential 
in benefit amount favoring the former). The disability pensioners will 
experience quite high mortality, while the other pensioners, at least 
for a few years, will experience very low mortality. This latter group 
would undoubtedly obtain the larger disability pensions if possible 
and therefore must be considered to be quite select medically. 
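
The hypothetical plans above can be put into a small numerical sketch. It is an illustration only, not a calculation from the paper: the cohort is split into just two health classes, and the shares, death rates, and retirement probabilities are all assumed figures. Retirement changes nobody's death rate here, so any departure from 100% comes from who is counted among the retired.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cohort:
    """Workers reaching age 65; every figure here is hypothetical."""
    frail_share: float  # share of the cohort in poor health
    q_frail: float      # one-year death rate in poor health
    q_healthy: float    # one-year death rate in good health

    def general_level(self) -> float:
        """Death rate of the whole cohort, retired or not."""
        return (self.frail_share * self.q_frail
                + (1 - self.frail_share) * self.q_healthy)


def retiree_mortality(c: Cohort, p_frail: float, p_healthy: float) -> float:
    """Death rate among retirees when retiring changes nobody's own rate.

    p_frail and p_healthy are the shares of each health class counted
    among the retired group under a given plan design.
    """
    frail_retired = c.frail_share * p_frail
    healthy_retired = (1 - c.frail_share) * p_healthy
    deaths = frail_retired * c.q_frail + healthy_retired * c.q_healthy
    return deaths / (frail_retired + healthy_retired)


cohort = Cohort(frail_share=0.2, q_frail=0.08, q_healthy=0.02)
general = cohort.general_level()

# Retiree death rate under three assumed plan designs.
rates = {
    "compulsory, disabled included": retiree_mortality(cohort, 1.0, 1.0),
    "voluntary retirement": retiree_mortality(cohort, 0.9, 0.3),
    "compulsory, disabled pensioned separately": retiree_mortality(cohort, 0.0, 1.0),
}
# Express each as a percentage of the general level.
ratios = {plan: 100 * q / general for plan, q in rates.items()}

for plan, ratio in ratios.items():
    print(f"{plan}: {ratio:.1f}% of general level")
```

With these assumed inputs the three designs give about 100%, 143%, and 62.5% of the general level, which is the pattern described for the first, second, and third plans.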


EXPERIENCE UNDER OLD-AGE AND SURVIVORS INSURANCE PROGRAM 


The old-age and survivors insurance program covers some 80% of 
the paid civilian jobs in the country. In its actual operation, a vast 
amount of valuable mortality experience has been accumulated. Un- 
fortunately, it has not been possible to tabulate and analyze all of this 
vast store of information, especially in regard to mortality data strati- 
fied by duration of retirement. 

In the early 1940’s a brief investigation was made as to select mor- 
tality by age and duration of retirement. This indicated that, as con- 
trasted with general population mortality, a person who has just re- 
tired has about 15% higher mortality. But this differential rapidly 
diminishes until after two or three years it has virtually disappeared. 

More recently it has been possible to make an investigation as to the 
over-all mortality experience of retired workers—but only by attained 
age and not with regard to duration of retirement. This experience is 
summarized in Table 1. For men there is very notable excess mortality 
at ages 65 and 66, but this differential gradually decreases for the older 
ages. This gives some indication of the higher mortality immediately 
after retirement. The effect thereof is diluted at the older ages as most 
of the experience is among continued lives rather than newly retired
ones. For women the same general tendency appears to be present although
to a much smaller degree. The mortality of male retired
workers at ages 75 and over very closely parallels population mortality, 
but for women the retired workers have 10-15% lower mortality at 
these ages and even at and shortly after age 65 the mortality is very 
close to that of the general population. 
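
The dilution described above can be sketched as a weighted average. The starting value of 115 follows the "about 15% higher" figure from the early investigation; the decline to 100 over three years and the duration mixes at each age are assumptions made only for illustration.

```python
# Hypothetical actual-to-expected ratios (percent) by years since retirement.
# Key 3 stands for "three or more years".
RATIO_BY_DURATION = {0: 115.0, 1: 108.0, 2: 102.0, 3: 100.0}


def attained_age_ratio(duration_shares: dict[int, float]) -> float:
    """Blend duration-specific ratios, weighting each duration by its
    share of the expected deaths at the attained age."""
    total = sum(duration_shares.values())
    blended = sum(RATIO_BY_DURATION[d] * s for d, s in duration_shares.items())
    return blended / total


# Assumed mixes: at 66 everyone is recently retired, at 75 few are.
shares_at_66 = {0: 0.60, 1: 0.40}
shares_at_75 = {0: 0.05, 1: 0.05, 2: 0.05, 3: 0.85}

ratio_66 = attained_age_ratio(shares_at_66)
ratio_75 = attained_age_ratio(shares_at_75)
print(f"age 66: {ratio_66:.1f}%   age 75: {ratio_75:.1f}%")
```

Under these assumptions the ratio is about 112% at age 66 and about 101% at age 75, even though the duration-specific ratios are the same at both ages.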

TABLE 1

RATIOS OF ACTUAL TO EXPECTED DEATHS AMONG RETIRED
WORKERSᵃ UNDER OLD-AGE AND SURVIVORS INSURANCE
SYSTEM, 1950-52ᵇ

Age              Men    Women

65              136%      90%
66              145      107
67              128       99
68              121       98
69              116       94
70              115       90
71              111       90
72              107       87
73              106       85
74              105       85

75-79            99       82
80-84            98       85
85-89           101       90

90 and Over     103      100
All Ages        109       90

ᵃ Actually, includes all persons who claimed benefits even though some returned to work.
ᵇ Expected deaths based on U.S. 1950 White Lives Mortality Tables. Actual deaths: men 367,000;
women 42,000.


The old-age and survivors insurance data clearly indicate that considerably
higher mortality than standard arises for individuals who
have just retired, but this differential gradually reduces. This is
particularly the case for men, although there is some indication of it also
being present for women.
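
For readers unfamiliar with the measure, a ratio of actual to expected deaths can be sketched as follows. Expected deaths are the lives exposed at each age multiplied by the death rate in the standard table. The exposures, rates, and death counts below are hypothetical and are not drawn from the old-age and survivors insurance records.

```python
def expected_deaths(exposure: dict[int, float], rates: dict[int, float]) -> float:
    """Deaths the standard table predicts: lives exposed times tabular rate, summed over ages."""
    return sum(lives * rates[age] for age, lives in exposure.items())


def ae_ratio(actual: float, expected: float) -> float:
    """Actual deaths as a percentage of expected deaths."""
    return 100.0 * actual / expected


# Hypothetical inputs for three ages.
exposure = {65: 10_000, 66: 9_000, 67: 8_000}    # life-years exposed
table_rates = {65: 0.030, 66: 0.033, 67: 0.036}  # standard-table death rates
actual = {65: 330, 66: 300, 67: 290}             # deaths observed

by_age = {
    age: ae_ratio(actual[age], exposure[age] * table_rates[age])
    for age in exposure
}
overall = ae_ratio(sum(actual.values()), expected_deaths(exposure, table_rates))

for age, ratio in by_age.items():
    print(f"age {age}: {ratio:.0f}%")
print(f"all ages: {overall:.0f}%")
```

The all-ages figure is total actual deaths over total expected deaths, so it is weighted toward the ages with the most expected deaths rather than being a simple average of the ratios by age.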


EXPERIENCE UNDER THE RAILROAD RETIREMENT PROGRAM 


The railroad retirement program covering some 1½ million workers
may be said to be a combination of an industry-wide private pension 
plan and a social insurance system since it contains elements of both.
In its actual operation, a very considerable amount of valuable mortality
experience has been accumulated. In fact, it is the only large
public retirement system for which good mortality data are available 
according to duration of retirement. 

Table 2 compares the actuarial rates used in cost valuations for 
mortality of active workers and retired persons for ages 65 and 70. 
These have been tested against actual experience to a certain extent. 
According to these figures it is not expected that the mortality of these 
two groups will differ greatly at those ages where most retirements oc- 
cur. 


TABLE 2

TABULAR MALE MORTALITY RATES USED IN RAILROAD
RETIREMENT SYSTEMᵃ

(per thousand)

Age    Active Service    Age Retirements    Ratio of Active to Retired

65          30.2              30.2                    100%
70          45.4              50.2                     91

ᵃ Source: “Retirement Policies and the Railroad Retirement System,” Part 1, Senate Report No.
6, 83rd Congress, 1st Session, pp. 341 and 357.


Table 3 compares the ratio of actual to expected deaths among age 
annuitants during a recent 3-year period. The characteristics of this 
plan are such that individuals may retire before age 65, with larger 
benefits if permanent and total disability is proved than if the retire- 
ment is for “age,” and under certain circumstances retirement can be 
for “occupational” disability. 

The mortality for age retirements at ages 60-64 is as much as 25%
below the expected level during the early years of retirement although 
ultimately the mortality of this group approaches very close to that 
of the life table used as the basis of determining expected deaths. On 
the other hand, for those retiring at ages 65-69, actual mortality is ap- 
preciably higher than expected mortality—particularly in the first two 
years of retirement. This could be anticipated because those reaching 
age 65 who are in better health tend to continue at work and conversely 
those in poorer health retire. Those retiring exactly at age 65 show
relatively less excess mortality in the first few years of retirement
than those retiring at ages 66-68. For those retiring at ages 70 and
over, the mortality experience is quite close to that expected and
shows no significant fluctuation with duration of retirement. This
might well be expected because age 70 is by employer practice virtually 
a compulsory age on most railroads. Accordingly this group is a good 
cross-section of persons of those ages—although perhaps somewhat 
healthier because they have been in employment up to that age. 


TABLE 3

RATIO OF ACTUAL TO EXPECTED DEATHS AMONG RAILROAD
RETIREMENT AGE ANNUITANTS, BY DURATION OF
RETIREMENT, 1947-50ᵃ

Age at              Duration of Retirement (Years)
Retirementᵇ

65             112%   110%    99%    95%    89%
66             135    118    115     99    105
67             123    114    109    116    124
68             141    132    110    109    105
69             111    110     95    107     97

60-64           74     87     75     78     96
65-69          121    114    104    101     99    107
70 and Over    103     93    103    101    104    104

All Ages       114    107    102    100    100    106

ᵃ Based on data furnished by Office of Director of Research, Railroad Retirement Board. Such data
in summary form are contained in Table A-2, Annual Report of the Railroad Retirement Board for the
Fiscal Year Ended June 30, 1952 (but shown there by attained age rather than age at retirement).
Expected deaths based on 1944 Railway Annuitants Mortality Table, set back 1 year in age. Actual
deaths: 25,545.

ᵇ Age last birthday.


Consideration of the railroad retirement data, in view of the specific 
provisions of that program, indicates quite clearly that the mortality 
of those who retire at and after age 65 is relatively high in the first 
few years of retirement. There is no conclusive evidence that this 
higher mortality is due to the act of retiring, but rather it seems prob- 
able that the retirements were to some extent caused by ill health 
which would have produced higher mortality anyhow. 


EXPERIENCE UNDER CIVIL SERVICE RETIREMENT SYSTEM 


The civil service retirement system covers some 1¾ million employees
of the Federal Government and so is, in effect, a large self-administered
pension plan. In general, depending upon length of service,
age retirement on full annuity can occur at ages 60 or 62. Prior to
then, in certain cases, both disability and age retirement benefits are 
available, but the latter are in a reduced amount so that any disabled 
person would attempt to have his retirement adjudicated as due to 
disability. 

Table 4 indicates the difference between the actuarial rates used 
in valuation of the system for the mortality of persons in active service 
and those who have retired on account of age. These two sets of figures 
should not be considered as reflecting the actual experience but rather 
give some indication of what is expected from an actuarial standpoint. 
For ages 60 to 70, the mortality of persons in active service is indicated 
to be some 30-50% lower than for persons who have retired on account 
of age. 


TABLE 4

TABULAR MALE MORTALITY RATES USED IN CIVIL
SERVICE RETIREMENT SYSTEMᵃ

(per thousand)

Age    Active Service    Age Retirements    Ratio of Active to Retired

60          14.1              20.8                     68%
65          17.6              30.9                     57
70          21.5              46.5                     46

ᵃ Source: Tables 27 and 31, 22nd Annual Report of the Board of Actuaries of the Civil Service
Retirement and Disability Fund for the Fiscal Year Ended June 30, 1942.
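
As a check on the arithmetic, the ratio column of Table 4 can be recomputed from its two rate columns. The sketch below copies the tabular rates from the table; the variable names are only illustrative.

```python
# Tabular male death rates per thousand from Table 4:
# age -> (active service, age retirements)
TABLE_4 = {60: (14.1, 20.8), 65: (17.6, 30.9), 70: (21.5, 46.5)}

# Active-service rate as a percentage of the age-retirement rate.
ratio_active_to_retired = {
    age: round(100 * active / retired)
    for age, (active, retired) in TABLE_4.items()
}

# How far active-service mortality falls below retired mortality, in percent.
percent_lower = {age: 100 - ratio for age, ratio in ratio_active_to_retired.items()}

print(ratio_active_to_retired)  # {60: 68, 65: 57, 70: 46}
print(percent_lower)            # {60: 32, 65: 43, 70: 54}
```

The shortfalls of 32, 43, and 54 percent correspond to the statement that active-service mortality is taken to be some 30-50% lower at ages 60 to 70.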


Unfortunately, select data according to duration of retirement are 
not available for this system. Table 5, however, does show the ratio of 
actual to expected deaths by attained age for age retirements during 
a recent 3-year period. For men, the mortality experience under age 60,
which relates to individuals who voluntarily retired on a reduced
annuity and thus apparently could not prove disability, was relatively
low, just as was the case in the railroad retirement data. For attained
ages 60-66, mortality is definitely higher than that according to the 
standard table, while at the older ages, the two tend to come together. 
Since this is an aggregate experience for all ages of retirement com- 
bined, it would be expected that this would occur at least after age 70, 
which is the compulsory retirement age. For women, the same general 
trends are evident except that there are greater fluctuations in the 
mortality ratios due to the smaller number of persons involved and
except that the mortality ratios for ages under 80 tend to show actual
mortality well below that expected. This is not a significant factor in 
the experience as to the effect of retirement, but rather indicates that 
the standard table in use for women has too high mortality rates. 


TABLE 5

RATIOS OF ACTUAL TO EXPECTED DEATHS AMONG CIVIL
SERVICE RETIREMENT NON-DISABILITY ANNUITANTS,
FISCAL YEARS 1950-52ᵃ

Age              Men    Women

Under 60         99%      40%
60              121       94
61              109       65
62              119       84
63              113       79
64              120       61
65              119       79
66              113       69
67              103       79
68              110       83
69              102       66
70-74            94       74
75-79            95       84
80-84            97      100
85-89            92       98
90 and Over      93      106

All Ages         99       81

ᵃ Based on data furnished by Retirement Section, U. S. Civil Service Commission. Expected
deaths based on tabular rates shown in Table 4. Actual deaths: men 16,307; women 1,561.


The experience under the civil service retirement program seems to
confirm, in general, that of the railroad retirement system. Mortality
is definitely lower than standard for those taking age retirement
prior to the normal age and is definitely higher for those retiring at the
normal age and a few years later. Again, this seems to indicate that
the higher mortality shortly after retirement at or after the normal
age is in considerable part due to the fact that ill health tended to
cause retirement rather than vice versa.


EXPERIENCE UNDER PRIVATE PLANS 


For many years insurance companies have collected experience
under the group annuity plans which they sell primarily to commercial
and industrial concerns. In general, the annuities are payable
beginning at age 65 regardless of whether the individual retires at that
age, although in actual fact he may not receive the payment. Two
subdivisions possible in the group annuity data are for “normal”
retirements (generally payable from age 65 on) and “early” retirements,
which in many, if not most, cases are disability retirements. As would
be anticipated, the mortality under the “early” retirements is very high,
especially at ages prior to 65, but subsequently tends to come closer
to the mortality for the “normal” retirements (see Table 6). On the
other hand, for the “normal” retirements, mortality shortly after age
65 tends to be somewhat low since payments generally begin automatically
at age 65 and are thus made to quite healthy lives since a considerable
number of disabled lives have already been eliminated as a
result of the “early” retirements.


TABLE 6

RATIO OF ACTUAL TO EXPECTED DEATHS AMONG MALE
SERVICE PENSIONERS IN THREE SELF-ADMINISTERED
PRIVATE PENSION PLANSᵃ

              Group Annuity (1946-50)     Plan Aᵇ      Plan Bᶜ      Plan Cᵈ
Age           “Normal”     “Early”       (1943-52)    (1946-51)    (1946-51)

Under 55 ᵉ 312% 465%
55-59 152% 248 280
60-64 96 197 166
65-69 102 149 116
70-74 112 136 118 102
75-79 119 137 123
80-84 137 113 131
85-89 124 ᵉ 120 129
90 and Over 127 ᵉ 128 ᵉ 110

All Ages 109 174 130 111 105

ᵃ Source: Report of Special Committee on Experience under Self-Administered Retirement Plans,
Transactions of the Society of Actuaries, 1953 Reports. Expected deaths based on 1937 Standard Annuity
Mortality Table. Actual deaths: Plan A, 5,316; Plan B, 613; Plan C, 1,672.

ᵇ Group of public utilities covered under uniform plan.

ᶜ Electric utility company.

ᵈ Large company in electrical manufacturing industry.

ᵉ Insufficient data.

The first results of a continuing study by a committee of the Society
of Actuaries in regard to the mortality experience under
self-administered retirement plans have recently become available. As
discussed previously, the resulting experience must be considered very
carefully in view of the fact that the particular provisions in each plan
will materially affect the results.

Table 6 compares the actual and expected deaths among male service
pensioners under three privately administered pension plans. It
should be noted that the mortality table used as a basis of the expected
deaths is significantly too low at the extreme ages (beyond age 80)
so that the resulting mortality ratios tend to be artificially high.
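
The effect of a standard table that is too low can be shown with one line of arithmetic. In the sketch below both death rates are assumed figures for some advanced age, not values from the 1937 Standard Annuity Mortality Table, and the function name is illustrative.

```python
def apparent_ratio(true_rate: float, table_rate: float, true_excess: float = 0.0) -> float:
    """Actual-to-expected ratio (percent) shown by a group whose real death
    rate is true_rate * (1 + true_excess) when expected deaths are computed
    from table_rate."""
    return 100.0 * true_rate * (1.0 + true_excess) / table_rate


# Hypothetical rates at an advanced age: the table understates mortality.
true_rate = 0.150   # assumed death rate actually prevailing
table_rate = 0.125  # assumed rate in the standard table

no_excess = apparent_ratio(true_rate, table_rate)    # pensioners die at the true general rate
accurate_table = apparent_ratio(true_rate, true_rate)

print(f"table too low: {no_excess:.0f}%   accurate table: {accurate_table:.0f}%")
```

With these assumptions pensioners with no excess mortality at all would still show a ratio of about 120%, so ratios above 100% at the oldest ages need not indicate any effect of retirement.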

Plan A has compulsory retirement at age 65 but has disability pen- 
sions prior to that age, which are included in this experience. Accord- 
ingly, as would be expected, there are very high mortality ratios prior 
to age 65, while those after age 65 tend to be somewhat above the group 
annuity “normal” retirement experience, at least between ages 65 and 
75. 

Both Plans B and C have compulsory retirement at age 65 and have 
separate disability benefits before age 65, the experience of which is not 
included here. As a result for ages 65 to 75, these plans show very low 
mortality since those entering the experience upon compulsory retire- 
ment at age 65 tend on the whole to be quite healthy lives. Certainly 
this latter experience would, of itself, not seem to give any indication 
that compulsory retirement produces high mortality. 


SUMMARY 


The preceding analyses of the mortality experience under various 
governmental and private pension programs indicate quite clearly that, 
in the absence of any special circumstances, the mortality of retired 
workers during the first year or two of retirement is considerably above 
the general level which otherwise might be expected but thereafter 
merges with such general level. It seems likely that this higher mortality 
in the early years of retirement arises from the fact that those in poorer 
health are more apt to retire at or shortly after the minimum retirement 
age, while the healthier individuals continue at work. 

An important factor to consider is that those retiring under a plan 
which does not have compulsory retirement generally tend to be the 
less healthy lives. On the other hand, in a plan providing for com- 
pulsory retirement at a particular age, those still in service at that age 
generally tend to be somewhat healthier than the normal population 
since they have recently been at work. Thus, it would be completely 
erroneous to contrast the mortality under a plan with compulsory 
retirement and that under a plan with voluntary retirement if only 
pensioners were considered. The results would seem to indicate 
lower mortality for the compulsory plan, which would not be a valid 
conclusion. It would really be necessary to contrast the mortality of 
pensioners under the compulsory retirement plan with that of both 
active employees and pensioners under the voluntary retirement plan. 
No such data were available to the author since usually mortality of 
active employees is not as closely studied as that for retired persons, 
particularly in governmentally administered plans. But if any progress 
is to be made in exploration of the subject of mortality after retirement, 
it will be necessary to obtain such data.¹ 

The preceding discussion does not, however, mean that compulsory 
retirement might not have a serious effect on an individual’s health 
and vitality, especially if he had not adjusted himself to the separation 
from employment. Unfortunately currently available data do not meas- 
ure the effect of retirement on mortality after retirement. A priori 
reasoning would seem to indicate that compulsory retirement would 
certainly have some deleterious effect on mortality for some persons. 





1 The Department of Sociology and Anthropology of Cornell University is currently conducting a 
longitudinal study on the effect of retirement on mortality and morbidity, as between retirants and 
non-retirants under plans having different provisions as to retirement policy. For a description of this 
study see Milton L. Barron, Gordon Streib, and Edward A. Suchman, “Research on the Social Dis- 
organization of Retirement,” American Sociological Review, 17 (1952). 








SAMPLING CONTROL OF LITERACY DATA* 


S. S. ZARKOVIC 
Federal Bureau of Statistics, Belgrade, Yugoslavia 


An attempt is described to control the value of literacy data 
by the use of sampling methods. The reason for this research 
is the widely known unreliability of this sort of statistics in 
countries with a high rate of illiteracy. This research has been 
conducted as a part of the post-enumeration survey, taken in 
connection with the Yugoslav census of population as of 
March 31, 1953. The aim of the survey was to check the accuracy 
and value of different census results. The value of 
literacy data was checked on the sample of individuals by 
means of reading and writing tests. The results show (i) that 
literacy is a continuous variable, and (ii) that the unreliable 
character of literacy statistics is connected with the difficulty of 
defining the limit between the different levels of literacy. Since 
these limits cannot be defined in the census of population, the 
best method to check the value of literacy data seems to be the 
use of sampling methods. 


THE PROBLEM 


DATA on literacy are usually obtained in the census of population. 
Each person over a given age is asked about his ability to read and 
write. 
What is the value of data provided in this way? 





* The author wishes to express his indebtedness to Mr. S. Krasovec, formerly director of the Federal 
Statistical Office, and to Mr. M. Macura, director of the Serbian Statistical Office, who spent a lot of 
energy to make this research possible. The research described in this paper belongs to the new field in 
statistics that could be labeled “The problem of the value of statistical data.” The most important 
work in connection with this problem has been done in the USA and India. The results and experience 
obtained so far can be found in the following papers: M. H. Hansen, W. N. Hurwitz, E. S. Marks, W. P. 
Mauldin: Response errors in surveys, Journal of the American Statistical Association, 46 (1951), 147-90; 
P. C. Mahalanobis: Recent experiments in statistical sampling in the Indian Statistical Institute, 
Journal of the Royal Statistical Society, 109; P. V. Sukhatme, G. R. Seth: Measurement of non-sampling 
errors, Journal Indian Society Agriculture, Vol. 4; P. V. Sukhatme: Measurement of observational 
errors in surveys, Revue de l'Institut International de Statistique, Vol. 20; M. H. Hansen, W. N. Hurwitz, 
L. Pritzker: The accuracy of census results, American Sociological Review, 1953; W. E. Deming: On 
errors in surveys, American Sociological Review, Vol. 9; E. S. Marks, W. P. Mauldin: Problems of 
response in enumerative surveys, American Sociological Review, Vol. 15; E. S. Marks, W. P. Mauldin, 
A. Nisselson: A case history in survey design: The post-enumeration survey of the 1950 census, Journal 
of the American Statistical Association, 48 (1953), 220-43; G. L. Palmer: Factors in the variability of 
response in enumerative studies, Journal of the American Statistical Association, 38 (1943), 143-52; 
S. S. Zarković: Completeness of enumeration (in Serbian), Federal Statistical Office, Belgrade, 1954; 
A. Gosh: Accuracy of family budget data with reference to period of recall, Calcutta Stat. Assoc. Bul., 
Vol. 5; M. H. Hansen, W. N. Hurwitz, W. G. Madow: Sample Survey Methods and Theory, New York, 
Wiley, 1953; W. E. Deming: Some Theory of Sampling, New York, Wiley, 1950; S. S. Zarković: 
Population Census Errors (in Serbian), Federal Statistical Office, Belgrade, 1954. 
The doubtful reliability of this sort of statistics is well known, for it 
is clear that the answers are to a large extent the result of personal 
opinion of what literacy is. In Population Census Methods¹ one reads: 
“The meaning of the data on literacy and illiteracy obtained in a 
population census depends obviously to an important degree upon the 
extent of reading and writing ability that is assumed by the enumer- 
ators and respondents to be required for an affirmative answer.” 

To illustrate the unreliable character of these data we shall mention 
two examples. 

In a European country with a very low percentage of illiterates, it 
was noticed after the mobilization during the last war that the per- 
centage of those unable to read and write was far higher than was 
found during the preceding census. The situation among women was 
still worse. It is obvious that the literacy situation, as depicted by the 
census statistics, is rather vague. 

The next example is from Yugoslavia. The successive censuses gave 
the following percentages of illiterates: 


1921—50.5 
1931—44.6 
1948—25.4 
1953—24.9² 


From this it appears that in the first 10 years illiteracy decreased by 
6 percentage points and in the next 17 years by 19 percentage points, in 
spite of the fact that the schools were practically closed during the war. 
But in the last five years, during the regular work of all schools, with a 
greatly expanded system of education, including a great number of courses 
for teaching the alphabets and compulsory learning of reading and writing 
for those in the military service, illiteracy remained on the same level. 
There is something in this situation that deviates from a logical pattern. 
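
The arithmetic behind this comparison can be checked with a short sketch. It uses only the four census percentages quoted above; the function name and the per-year rate are illustrative additions rather than part of the original analysis, and the declines are differences between percentages, i.e. percentage points.

```python
# Census percentages of illiterates as quoted in the text.
illiteracy = {1921: 50.5, 1931: 44.6, 1948: 25.4, 1953: 24.9}

def declines(series):
    """Return (start, end, drop in points, drop per year) for successive censuses."""
    years = sorted(series)
    rows = []
    for start, end in zip(years, years[1:]):
        drop = series[start] - series[end]
        rows.append((start, end, round(drop, 1), round(drop / (end - start), 2)))
    return rows

for start, end, drop, per_year in declines(illiteracy):
    print(f"{start}-{end}: {drop} points, {per_year} points per year")
```

The yearly rate of decline implied for 1931-48, a period that includes the war, is roughly twice that for 1921-31 and many times that for 1948-53, which is the irregularity the text points to.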

On the basis of this figure for 1948, estimated with good reasons as 
being optimistic, while preparing for the census of 1953, we put in our 
program an investigation of the value of answers given by respondents 
on all census questions and consequently on the question of literacy 
as well. 

This research proceeded in two directions. First, a sample of indi- 
viduals was drawn immediately after the census, each of whom was 
requested by a specially trained inspector to answer again all census 
questions. Here the interest was in the stability of answers obtained in 





1 United Nations, 1949, p. 83. 
2 This figure is a sample estimate. 
the census and in the extent of errors in them. The second research, 
also based on the sample of individuals, had as its aim to check, by 
means of tests, the degree of literacy of those who declared in the 
census that they were literate. 


DATA ON THE SAMPLE³ 


To facilitate the organization of the census the whole country was 
divided into 118,999 enumeration districts (e.d.) with an average of 142 
people, ranging in size from 0 to 300. For the purposes of sampling 
these e.d. were stratified into two strata, urban and rural. The urban 
stratum consisted of 29,805 e.d. and the rural one of 89,194. Before 
the beginning of the census only the total number of e.d. was known 
in each administrative unit and their distribution in strata. On this 
basis a random sample of 149 e.d. was drawn in the rural stratum and 
100 in the urban one. Each e.d. was assigned equal probability. 

In order to get individuals, subsampling was used. In the research 
on the stability of answers in the urban stratum the sampling fraction 
of 1:10 was applied to the total number of the enumerated people in 
an e.d. In the rural stratum the sampling fraction was 1:8. In this way 
a total of 1,682 people were investigated in the urban stratum and 
2,470 in the rural one. 
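
The two-stage design described above can be sketched as follows. This is a minimal illustration under stated assumptions: the text says only that districts were drawn with equal probability and that a fraction 1:k was applied within each, so the systematic (every k-th person) subsampling, the function name, and the artificial frame of 1,000 equal-sized districts are all hypothetical.

```python
import random

def two_stage_sample(districts, n_districts, k, rng):
    """Draw districts with equal probability, then take every k-th person in each.

    districts: dict mapping a district id to the list of people enumerated in it.
    k:         denominator of the within-district sampling fraction 1:k.
    """
    chosen = rng.sample(sorted(districts), n_districts)   # first stage
    people = []
    for d in chosen:
        start = rng.randrange(k)                # random start among the first k
        people.extend(districts[d][start::k])   # second stage: every k-th person
    return chosen, people

rng = random.Random(0)
# Hypothetical frame: 1,000 urban districts of 150 people each.
urban = {d: [(d, i) for i in range(150)] for d in range(1000)}
chosen, people = two_stage_sample(urban, n_districts=100, k=10, rng=rng)
print(len(chosen), len(people))
```

Because every district has the same chance of selection and the same fraction is applied within each stratum, every person in a stratum has the same overall probability of entering the sample.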

This gives the situation in Table 1. 


TABLE 1 
SOME DATA ON THE SAMPLE 

Number of people in     Percentage     Number of people in 
the sample of           of total       the sample of 
primary units                          secondary units 

19,818                  0.163          1,682 
16,785                  0.347          2,470 
36,603                  0.216          4,152 























3 Detailed information on this sample is given in S. S. Zarkovich: Estimating Census Figures (in 
Serbian), series “Studies and Analyses,” No. 1, Federal Statistical Office, Belgrade 1953. 
The selection of secondary units was made on the basis of the census 
questionnaires, concentrated at that time in the commune office. 

The task of inspectors was to find the people drawn into the sample 
and, using special control questions and available documents, to at- 
tempt to get the right answer on all census questions. In this work 
inspectors didn’t know the answers given by respondents during the 
census. 

These are data on the sample designed for the purpose of the general 
control of all the answers on census questions. 

For the second research only those persons have been taken into 
account who: i) were, at the beginning of the census, 10 years of age 
and over, ii) put the answer “reads” and “writes” in the census ques- 
tionnaire, iii) had an education of 4 years of elementary school or less. 
Those having more education were considered as definitely literate. 

For this program the same sample of primary units was used as in 
Table 1. Since the definition of the population now was changed, the 
census returns were used again to select a new sample of secondary 
units. In this selection the sampling fraction of 1:6 in the rural stratum 
and 1:8 in the urban one was applied. So the sample consisted of 417 
people in the urban stratum and 1,022 in the rural one. 

In addition, another small sample was drawn of those who declared 
themselves illiterate. The purpose of this sample was to check whether 
this group of the population was homogeneously illiterate. 

The individuals selected in this way came into a school where a test 
of their ability to read and to write was administered. Reading was 
investigated by means of 15 tests,⁴ each having some printed phrases 
and three control questions that had to be answered by marking the 
right answer. Each right answer represented one point. The maximum 
number of points was 45. The testing of each group was limited to 10 
minutes. 

The ability to write was tested in a similar way. Our inspector dic- 
tated three phrases that had to be written in a limited time. 


ANALYSIS OF ERRORS 


When the field work was completed the control forms were matched 
against the census forms and the cases were defined as errors when the 
answers were not identical. Thus, for literacy, wrong answers amounted to 
3.2 per cent in the rural stratum and 2.8 per cent in the 
urban one. These percentages are calculated on the basis of total 





4 These tests were prepared in the Department of Psychology, University of Belgrade, by B. 
Stevanovich, N. Rot, Z. Vasich and M. Jovichich. 
number of enumerated people in the e.d. Not all of these errors, 
however, represent a changed answer on literacy. Only 59.0 per cent 
(36 people) of the total number of errors in the urban stratum and 65.4 
per cent (53 people) in the rural stratum represent changes in the answer 
previously given. The remaining errors are omissions of answers for 
people 10 years of age and over, or answers given for children 
under 10 years. 

The next problem is as follows: is there any tendency among individuals 
to declare themselves literate when they are not so or to declare 
themselves illiterate when really literate? Letting the first type of 
tendency be represented by + and the second by −, we found the 
following distribution of changed data: 














               Males        Females 
Stratum       +    −       +    − 
Urban         3    3      16   14 
Rural         5    8      15   25 





These figures do not agree with what is a rather general opinion, viz. 
the existence of a marked tendency among individuals to declare themselves 
literate even if not so. If there is a place here for any tendency it 
should be stated in just the opposite way (in accordance with the 
results for the rural stratum). 
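
One way to gauge how strong such a tendency is would be an exact two-sided sign test on the counts in the table above. This test is not part of the original analysis; it is offered as a rough sketch that treats each changed answer as an independent trial and so ignores the clustering of respondents within enumeration districts.

```python
from math import comb

# Changed answers from the table above, males and females combined:
# "plus"  = declared literate when not so, "minus" = the reverse.
changes = {
    "Urban": {"plus": 3 + 16, "minus": 3 + 14},
    "Rural": {"plus": 5 + 15, "minus": 8 + 25},
}

def sign_test_p(plus, minus):
    """Two-sided exact binomial p-value for H0: + and - are equally likely."""
    n = plus + minus
    k = min(plus, minus)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

for stratum, c in changes.items():
    p = sign_test_p(c["plus"], c["minus"])
    print(stratum, c["plus"], c["minus"], round(p, 3))
```

The stratum totals reproduce the 36 urban and 53 rural changed answers quoted earlier. Under these assumptions the urban counts show no imbalance, and the rural excess of minuses is only weakly suggestive, which fits the cautious wording of the text.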

This conclusion might appear somewhat hazardous since both 
minuses and pluses may conceivably represent errors in the second 
report. But we think it is safe within the best possibilities of the 
checking procedure in this field. Each changed answer was subject to a 
special investigation.⁵ So only some intermediate cases (vide infra) might 
represent the problem. 


SOURCE OF ERRORS 


Now we face the very important practical question of the source of 
these errors. 

The information given by the respondents with the wrong census 
answers shows that two main sources of errors exist. In the first group 
the extreme cases are included, namely individuals being absolutely 
literate or illiterate. In the second group the intermediate cases are 
involved. 





5 The description of the checking procedure is given in S. S. Zarkovich: Population Census Errors, 
Belgrade 1954. 
By the definition of the group, in the first case the answer on the 
question of literacy is known. If someone never learned reading and 
writing or if a person had a college education it is clear what the 
answer should be. But the errors still appear. For absolute illiterates 
we found answers “literate” and vice versa. 

In our census the source of these errors is the system of enumeration. 
In our system the questionnaires were distributed one day before the 
beginning of the census and collected the day afterwards. Meanwhile 
everybody was supposed to fill in the answers personally (if literate). For 
children and illiterate people the giving of answers was the duty of 
parents or some other member of the family. The enumerator had to 
check and correct data given or to put down answers in the case when 
no one was able to do it (in villages). 

Now, if a member of a family fills the questionnaires for the others 
he may not always be well informed on what the answer should be. 
This holds particularly for people on the lower cultural level, where 
no attention is paid to literacy. If the enumerators do this job on the 
basis of information given by the head of the family the errors appear 
in the same way. It would be best if the enumerators had a separate 
talk with each respondent. In this case the number of errors would 
probably be smaller. 

Consequently, these errors can be influenced primarily by changing 
the system of enumeration (if there is any possibility to do so). The 
recommendation in Population Census Methods by which the “general 
adoption of the criterion . . . ability to read and write a simple mes- 
sage in any language, would help to improve the comparability and 
meaningfulness of census statistics on this subject” does not seem to 
be useful. 

In the group of intermediate cases the errors appear because a person 
with a low level of literacy declares himself literate and vice versa. 
Here the respondents don’t know what their answers should be. 
To what degree should the ability to read and write be developed to 
justify either of the two possible answers? The problem is the limit that 
divides literacy from illiteracy. Considering the fact that this limit can 
only be defined in terms of some units, it is obvious that no system of 
enumeration is likely to change the frequency of errors. It also seems 
that the above recommendation in Population Census Methods couldn’t 
be expected to be helpful. We found a lot of people able to distinguish 
any letter and read any word, but reading represented a tremendous 
effort for them, in which they used 20 times more time than a man with 
a university education. From the point of view of the “ability to read 
and write a simple message” they are literate, but from any practical 
point of view they are illiterate. In general, such an individual does 
not use his ability to read and write at all, because this is for him as 
painful a job as any other in which great physical efforts are 
concerned. The limit of such literacy is illiteracy. 


MEASURING DEGREE OF LITERACY 


Literacy is a continuous variable with illiteracy and complete 
literacy on its extremes. Any intermediate value brings about the 
question whether it should be called literacy or not. The difficulties 
appear because of the very nature of literacy as a census charac- 
teristic. If it is desired to have a more precise insight in what the 
literacy of the “literate” people really means, there is no other way— 
as it seems to me—than to draw a sample of “literate” individuals, 
apply some measuring and, on the basis of the results received, esti- 
mate the percentage of the people on different levels of literacy. 

This was the aim of our second research. In connection with this the 
following had to be done: 

i) prepare tests for measuring, 

ii) define the limits between different classes of literacy in terms of 
units of these tests, 

iii) apply these tests and calculate the percentage of people in each 
class. 

The results received are shown here in Tables 2 and 3. In the stubs 
of these tables is the reading score and in the headings, the writing 
score. 

These figures show the existence of correlation between the ability 
to read and the ability to write. Then, one sees that there is a percent- 
age of people with a pretty high score in reading and a low one in 
writing (the first two columns). They also show that among the people 
with moderate or high scores in writing there is a percentage of those 
(the first row) who did not understand what they read. These have 
no points in reading. 

At the same time these data show that even in the class of the literate 
people (because in this research only those have been included who 
declared themselves “literate” in the census) there is a number of those 
who didn’t get any points in either reading or writing (first column, 
first row). These are illiterate. Most of these cases were separately in- 
vestigated and their illiteracy was proved. It was found that their 
presence in the class of literate people was due to the system of enu- 
meration: those giving answers for them put them in this group. 

On the basis of these tables, the possibility of estimating the real 
number of literates by combining the scores both in reading and writing 
is obvious. Because of the small sample, our problem was the trichotomous 
classification. Experiments to give the respective values of limits 
are now in process. 

TABLE 3 

DISTRIBUTION OF SCORES IN READING AND WRITING 
(Rural stratum) 
But another illustration of the limits can be given here. Taking into 
consideration only the errors in reading, it was found that a man 
with a university education takes on the average 6 minutes to get 
through all tests. In doing so he generally makes no errors and gets 
45 points. In other words, one point takes approximately 8 seconds. 
In the same way the score of 5 points in 10 minutes means an average 
of 120 seconds per point, or reading 15 times slower than that of a man 
with a university education. Adopting now the criterion that reading 15 
or more times slower than our standard, i.e. 0-5 points, represents 
the class “illiteracy,” that 5-15 points, or reading 15 to 5 times 
slower, defines “middle literacy,” and that 15 points and over defines 
the class “literacy,” our data give the following distribution: 








Degree of               Stratum 
literacy             Rural    Urban 

Illiteracy            , 
Middle literacy      21.      27. 
Literacy             71.      64. 




To complete these data Figure 1 is also given. On the x-axis is the 
score in reading and on the y-axis the percentage of persons having 
reached the respective score. This figure is characteristic for the 
problem of literacy. 

Consequently, the results of such research can be used to enlarge 
the knowledge of what is behind the general term “literate” people. 


CONCLUSIONS 


Data on literacy, as obtained by the census, probably are not very 
reliable in any country, although the degree of their value may vary 
considerably in connection with specific conditions. The possibility 
of influencing their reliability by means of better definition is also 
doubtful. 

There are two main problems in this field: i) the value of the answers 
of those absolutely literate and absolutely illiterate and ii) the quality 
of answers of intermediate cases, i.e. of the people who are neither 
completely literate nor illiterate. In the first case the right answer 
depends mostly upon the system of enumeration and in the second 
one upon the lack of a criterion as to the level of ability to write and 
read at which literacy begins. To control the meaningfulness of data 
belonging to this second class, an experiment of the described sort is 
very useful. 
Some may agree on the usefulness of such a control, but doubt 
might arise as to the possibility of carrying through similar large-scale 
research. The problem may especially arise in connection with 
the willingness of the people to be tested. 

Perhaps some words on our experience will be useful here. To carry 
on this experiment we used 250 young employees of the statistical 
office who had been trained two to three hours a day for less than 
two weeks. They were charged with the whole field work in the appli- 
cation of sampling methods in connection with this census. Most of 
them had never had any contact with psychology and education, but 
in spite of that their technique of experimentation was considered by 
experts as very satisfactory. 

On the other hand, we did not have any difficulties with people. 
Before these experiments started, our inspectors contacted respondents 
in connection with the control of the completeness of enumeration 
and the control of all answers in the census questionnaire. On this 
occasion they also made contact with the people selected for this 
experiment and explained to them the purpose and the sense of this work. 
The result was that out of 1,439 initially selected persons only 14 
didn’t come to the testing place. They were replaced by others, 
also selected at random. 

If this experience has any meaning for other countries, I should con- 
clude that the sampling control of the literacy data does not raise 
serious difficulties. 





RESPONSE ERRORS IN ESTIMATING 
THE VALUE OF HOMES* 


LESLIE KISH and JOHN B. LANSING 
Survey Research Center, University of Michigan 


In the 1950 Survey of Consumer Finances home owners 
were asked to estimate the market value of their houses. Es- 
timates for these same homes were later made by professional 
appraisers. These two estimates for each of 568 homes com- 
prise the data analyzed here. The proportion of discrepancies 
between the two estimates is great: only 37 per cent of the 
estimates by respondents are within plus or minus 10 per cent 
of the appraisers’ estimates. However, the errors tend to be 
offsetting, and in none of the ten price classes used is the differ- 
ence in the relative frequencies for owners and appraisers sta- 
tistically significant. Similarly, although the root-mean-square 
difference between the two measurements is high (an average 
of $3,100), the mean of the respondents’ estimates is only 
$350 higher than the mean of $9,200 for the appraisers’ es- 
timates. The amount of variability is found to be rather simi- 
lar for several sub-populations. However, for houses worth 
over $10,000 the mean-square difference between the measure- 
ments is found to increase with the value of the home. In the 
Appendix a model is developed for the statistical investigation 
of the data. 


INTRODUCTION 


KNOWLEDGE of the over-all financial position of consumers has been a 
primary objective of the Survey of Consumer Finances conducted 
annually since 1945 by the Board of Governors of the Federal Reserve 
System in cooperation with the Survey Research Center of the Uni- 
versity of Michigan. 

More than half of American families live in their own homes, and 
for the vast majority of these families, that home is their most valu- 
able single asset. To be complete, then, any analysis of the financial 
position of consumers must cover this asset. 

In the 1950 Survey of Consumer Finances, respondents were asked 
to give their idea of what their house was worth.¹ The answers they 





* The authors are indebted to Clarke L. Fauver of the staff of the Federal Reserve Board, who 
initiated the research reported here while he was on the staff of the Board's division of Research and 
Statistics. They are also indebted to the American Institute of Real Estate Appraisers, the Federal 
Housing Administration, and the Society of Residential Appraisers for their participation in the field 
work. 

1 For a discussion of the methods used in this survey see G. Katona, L. Kish, J. B. Lansing and 
J. K. Dent, “Methods of the Surveys of Consumer Finances,” Federal Reserve Bulletin, 36 (1950), pp. 
795-809. 
gave have been tabulated and on the basis of those replies tables were 
published in the Federal Reserve Bulletin² showing distributions in 
class intervals of owners’ estimates of the current value of their homes. 
The published distributions are for all owners, for owners having differ- 
ent incomes, owners with different occupations, and owners living in 
towns and cities of different sizes. This is important basic information 
for the student of housing economics. 

The question naturally arises, how reliable are these data? How 
much does the average householder know about the going market price 
for his house? Assuming for the moment that he does know, is his 
answer to the interviewer’s question likely to be seriously biased? Are 
recent buyers of homes more informed about current market conditions 
than owners who may have bought many years earlier? 

It was in an effort to answer some of these questions that a special 
attempt was made to evaluate the responses given to questions con- 
cerning house values in the 1950 Survey of Consumer Finances. Re- 
spondents who reported they owned their own homes were asked in 
January and February 1950 whether they had purchased their homes 
in 1949 or in some earlier year. Those who had purchased before 1949 
were asked: “Could you tell me what the present value of this house is? 
I mean about what would it bring if you sold it today?” (A similar 
question was asked in the 1950 Census of Housing.) Those who had 
purchased their homes during the year 1949 were asked: “How much 
did the house and lot cost?” 

Subsequent to the completion of the interviews it was decided to 
check the estimates of respondents by obtaining estimates from quali- 
fied residential appraisers. Through the cooperation of the American 
Institute of Real Estate Appraisers, the Federal Housing Administra- 
tion, and the Society of Residential Appraisers, arrangements were 
made to have professional appraisers visit a substantial number of the 
properties. The appraisers were not required to obtain access to the 
property; they were asked to look at it from the outside and to estimate 
its value in the light of their experience and familiarity with local real 
estate conditions. 

From the sample of home owners found in the yearly survey a sub- 
sample was selected, including respondents who failed to answer the 
questions about home-ownership, but not including any potential re- 
spondents who had not been interviewed during the regular survey. 





2 J. A. Frechtling, J. H. Lorie and Irving Schweiger, “1950 Survey of Consumer Finances, Part V, 
The Distribution of Assets, Liabilities, and Net Worth of Consumers, Early 1950,” Federal Reserve 
Bulletin, 36 (1950), pp. 1595-97. 
(In the subselection a higher probability of selection was given to 
the more extreme house values.) The sample was distributed roughly 
evenly among the three participating organizations. The response rate 
in the follow-up study was 89 per cent. This high response rate, a result 
of the excellent cooperation on the part of these professional groups, 
made possible the analysis which follows. 

The number of homes used in the first stage of the analysis is the 637 
for which forms were returned. In 30 of the 637 cases the value of the 
property was not indicated on the completed form. In an additional 39 
cases the respondent failed to give a usable answer to the question in 
the original survey. Hence there are 568 homes for which two estimates 
of value are available. (In calculating the response rate of 89 per cent 
mentioned above these 69 cases were treated as responses since some 
useful information is available about them. If the 69 were classified as 
non-responses, the response rate would be 79 per cent.) 
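The counts in this paragraph can be checked with a few lines of arithmetic. The sketch below is only a consistency check: the size of the subsample is not stated in the text, so `implied_subsample` is backed out from the rounded 89 per cent figure and is an assumption, not a reported number.

```python
# Counts quoted in the text.
forms_returned = 637          # appraisal forms returned
no_appraised_value = 30       # forms without an appraised value
no_owner_answer = 39          # homes without a usable owner estimate

usable_pairs = forms_returned - no_appraised_value - no_owner_answer

# ASSUMPTION: the subsample size is not given in the article; it is
# inferred here from the rounded 89 per cent response rate.
implied_subsample = round(forms_returned / 0.89)

# Response rate if the 69 incomplete cases are counted as non-responses.
strict_rate = usable_pairs / implied_subsample

print(usable_pairs, implied_subsample, round(100 * strict_rate))
```

Under this assumption the stricter rate comes out at about 79 per cent, in line with the figure quoted above.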

Essentially, the analysis was divided into two stages. The first stage 
involved a simple comparison of the frequency distributions and cross 
tabulations obtained by the original survey and by the follow-up study. 
The second stage involved the statement of a mathematical model of 
the response error, and estimates of the terms of the basic equation of 
this model. Although the conclusions drawn in the second stage are 
described in the main body of this article, the model itself appears 
in the Appendix. 


COMPARISONS OF CELL FREQUENCIES 


The first step in the analysis was to compare the frequency distribu- 
tion obtained from the survey of owners with that from the survey of 
appraisers. The results of the comparison appear as the first two columns
of Table I. Columns (3) and (4) show cumulative totals for columns 
(1) and (2), respectively. 

The fifth column of Table I shows the distribution of appraisers’ 
estimates for the 39 cases which were “Not Ascertained” in the survey. 
On seeing the “NA’s” in any table one is led to wonder about their 
effect on the entire distribution. There is a mere suggestion of a con- 
centration of several homes with very low values among these 39 cases. 
But anyone who assumed that the 39 cases should be distributed pro- 
portionally would not have been led far astray. 

TABLE I

FREQUENCY DISTRIBUTIONS OF THE VALUE OF OWNER-
OCCUPIED HOMES BASED ON ESTIMATES REPORTED BY
OWNERS AND APPRAISERS (UNCORRECTED)ᵃ

(percentage distribution of homes)

Columns:
(1) Respondents’ estimates
(2) Appraisers’ estimates
(3) Cumulative total of respondents’ estimates
(4) Cumulative total of appraisers’ estimates
(5) Appraisers’ estimates where respondents’ estimates were not ascertained
(6) Difference between proportions, (1)-(2)
(7) Standard error of the difference in (6)

Value of Home              (1)     (2)     (3)     (4)    (5)      (6)      (7)

Under $2,500               2.9     2.3     2.9     2.3     14    +0.6%     0.7%
$2,500-4,999              13.1    13.7    16.0    16.0     14    -0.6%     1.4%
$5,000-7,499              19.6    19.3    35.6    35.3     20    +0.3%     1.9%
$7,500-9,999              21.5    24.3    57.1    59.6     18    -2.8%     1.9%
$10,000-12,499            19.1    16.8    76.2    76.4      7    +2.3%     1.8%
$12,500-14,999             6.5     8.8    82.7    85.2     10    -2.3%     1.2%
$15,000-19,999             7.2     6.3    89.9    91.5      3    +0.9%     1.1%
$20,000-29,999             2.8     2.2    92.7    93.7      3    +0.6%     0.7%
$30,000 and over           1.5     1.4    94.2    95.1      3    +0.1%     0.4%
Value not ascertained      5.6     4.7    99.8    99.8      8    +0.9%     1.2%

Total                     99.8ᵇ   99.8ᵇ                   100
Number of homes            637     637     637     637     39

ᵃ These “uncorrected” distributions contain clerical errors which were discovered and corrected
in the course of comparing the data from respondents and appraisers. Later tables are based on
corrected data except as indicated.

ᵇ Detail does not add to 100.0% owing to rounding.

In column (6) the differences between the entries of columns (1) and
(2) are given. These differences are subject to sampling variability.
If we sent out both interviewers and appraisers to repeated samples of
600 cases under identical conditions, we would expect that the bracket
distributions sometimes would show closer agreement than (1) and (2),
and sometimes wider disagreement. The model in the Appendix permits
us to estimate the probability that the proportions in any pair of
cells will agree within a given range. That is, we can estimate how the
differences shown in column (6) would fluctuate if the present study
were repeated many times. The measures of this fluctuation, estimated
from the data of this study, are presented in column (7) in terms of the
standard errors of the differences. We may illustrate the interpretation
of these columns as follows: the discrepancy between the proportion of
homes placed in the bracket $2,500-4,999 by respondents and appraisers
was 0.6 per cent in the present study; if the study were repeated
many times, this difference would be less than 1.4 per cent in two
studies out of three in the long run, and it would be less than 2.8 per
cent in 19 studies out of 20.
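The "two studies out of three" and "19 out of 20" statements correspond to one and two standard errors of a normally distributed difference. The following sketch, which assumes a normal approximation (the article does not name the distribution it relies on), computes those probabilities and expresses each difference in column (6) of Table I in units of its standard error from column (7).

```python
import math

def prob_within(k):
    """P(|Z| < k) for a standard normal variable Z (normal approximation assumed)."""
    return math.erf(k / math.sqrt(2))

# (bracket, difference (1)-(2), standard error), copied from Table I, columns (6) and (7).
rows = [
    ("Under $2,500", 0.6, 0.7),
    ("$2,500-4,999", -0.6, 1.4),
    ("$5,000-7,499", 0.3, 1.9),
    ("$7,500-9,999", -2.8, 1.9),
    ("$10,000-12,499", 2.3, 1.8),
    ("$12,500-14,999", -2.3, 1.2),
    ("$15,000-19,999", 0.9, 1.1),
    ("$20,000-29,999", 0.6, 0.7),
    ("$30,000 and over", 0.1, 0.4),
]

# Each difference expressed in standard-error units.
ratios = {name: diff / se for name, diff, se in rows}
largest = max(ratios, key=lambda name: abs(ratios[name]))

print(f"within 1 s.e.: {prob_within(1):.3f}, within 2 s.e.: {prob_within(2):.3f}")
for name, ratio in ratios.items():
    print(f"{name:18s} {ratio:+.2f}")
```

On these figures no bracket differs by as much as two standard errors; the largest ratio belongs to the $12,500-14,999 bracket.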
The two distributions in (1) and (2) convey the same general impression
about the proportion of owner-occupied homes of different values.
The same would be true of other similar distributions from replications 
of the present study in view of the relatively small size of the errors 
shown in (7). We make this judgment (and ask the reader to do like- 
wise) within the general framework of the errors and requirements of 
surveys of this kind and size. It would be fruitless for us to raise here 
the question: for what kind of decisions are our results “reliable 
enough”? Our investigations do provide assurances against the exist- 
ence in the procedures of large response errors. “Large” here is taken in 
the context of the actual sizes of the sample and of the sampling 
errors—but we must neglect the question of the relative cost of reducing 
the response error. 

Although we find no reliable evidence of a net bias in any price class, 
it is possible (and even probable) that a large enough sample would un- 
cover biases which escape detection in this sample. We have shown 
only that the differences between columns 1 and 2 could be the result 
of random response variation.

The second step in the analysis was to examine the discrepancies 
between the estimates of respondents and appraisers. The similarity 
between the first two columns in Table I could be the result either of 
few errors or of many off-setting errors. Table II compares the classifi- 
cation of the homes by respondents and appraisers. A sum of the pro- 
portions in the cells along the diagonal indicates that 43 per cent of the 
homes that were included in a given bracket by the respondents were 
also placed in that bracket by the appraisers. Errors were, in fact, 
frequent, but generally off-setting. 

On close examination some of the differences shown in Table II 
seemed out of all reason—how could any house valued by a respondent 
at under $2,500 be valued by an appraiser at over $15,000? This ques- 
tion raises the possibility of errors in the survey process made by others 
than respondents and appraisers. The information in Table II was used 
to guide a special search for errors. All cases where the two estimates 
were in disagreement by more than one “bracket” (coded class of 
house value) up to a value of $15,000, and above that value all cases 
not in the same “bracket,” were selected for study. 

This search involved a comparison of the original interview, the
appraiser’s report, and the card on which the data had been punched.
Such a search is unlikely to turn up errors by interviewers in recording
the answers given by respondents, but it should disclose any errors in
coding. An examination of 109 cases yielded 17 errors, all but two of
them clerical errors by coders. Of the 17, four involved only errors in
the conversion of a dollar amount (entered correctly) to a bracket
(entered wrongly). There were 11 clerical errors made in coding the
respondents’ estimates, and two errors were made by interviewers.
Ten of these 11 errors involved entries of one-tenth of the proper
amount owing to the omission of a zero; in the other case, $11,000 was
read as $77,000. In addition to these errors two exceptional cases were


TABLE II 


RELATION BETWEEN APPRAISER’S ESTIMATE AND 
RESPONDENT’S ESTIMATE (UNCORRECTED)ᵃ


(percentage distribution of homes) 








Respondent’s Estimate 





Under     $2,500    $5,000    $7,500    $10,000    $12,500    $15,000    $20,000    $30,000
$2,500    -4,999    -7,499    -9,999    -12,499    -14,999    -19,999    -29,999    & Over





Under $2,500 ; 0.4 0.2> 
$2,500- 4,999 E 7.2 3.8 
$5,000- 7,499 3.4 8.3 
$7,500- 9,999 3 0.5 5.0 
$10,000-12, 499 q 0.3 
$12, 500-14, 999 ’ 0.7 
$15, 000-19, 999 L ae 
$20,000-29, 999 \o.2| 

$30,000 and over Eo 0.2> 
Value not ascer- 
tained 0.2 1.4 1.3 0.4 0.5 0.2 0.1 





Total 2.9 13.1 19.6 21.6 19.1 6.5 7.2 2.8 1.5 , 99.8 
Number of casesᶜ 80 108 120 105 39 50 47 26 39 637








ᵃ For the difference between corrected and uncorrected data, see the discussion in the text. The principal effect of
the corrections on this table is to empty the cells indicated by boxes, distributing the entries among other occupied
cells. The table reads as follows: 1.0% of all houses in the sample were valued at under $2,500 by the respondent and
also by the appraiser; 0.4% of all houses were valued at $2,500-$4,999 by the respondent, but at under $2,500 by the
appraiser; etc.

ᵇ These two cells contain one case apiece. They are the exceptional cases noted in the text.

ᶜ Because there were three different weights used, the percentages are not simple ratios of the total of 637.


noted. In one, it seemed clear that the appraiser had included only part 
of the property which the respondent had in mind. In the other, the 
appraiser based his estimate on the commercial value of the property, 
while the respondent based his on the value for residential purposes. 
The effect of these errors on the entries of a few cells in Table II is
shown: certain cells which are emptied by the corrections, or which
contain only exceptional cases, have been indicated by being enclosed
in “boxes.” All of the most extreme discrepancies in Table II disappear,
but the marginal distributions are little changed by the corrections.
It is interesting to note that the lowest class was composed in
large part of errors.³

The comparison in Table II is supplemented by another approach
in Table III: this presents the distribution of each respondent’s
estimate divided by the appraiser’s estimate on his home. This division
was carried out for 568 homes. The respondents’ estimates were within
plus or minus 10 per cent of the appraisers’ in 37 per cent of the
cases. On the other hand, the discrepancy was more than plus or minus
30 per cent for 24 per cent. Of these 24 per cent, 18 per cent represent
overestimates by respondents, suggesting a tendency for owners to
overvalue their homes. This possibility can be better evaluated by
comparing the means of the two distributions (after correction of the
clerical errors).


TABLE III

FREQUENCY DISTRIBUTION OF RESPONDENT’S ESTIMATE
DIVIDED BY APPRAISER’S (IN BRACKETS)ᵃ

Respondent’s Estimate Divided
by Appraiser’s                      Proportion of Homes

Under 70%                                    6
70-89%                                      20
90-109%                                     37
110-129%                                    19
130-149%                                     9
150% and over                                9

Total                                      100
Number of homes                            568

ᵃ Uncorrected for the clerical errors discovered only after comparison of the two estimates.
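The percentages quoted from Table III can be re-derived directly from its brackets. This is a small sketch; the dictionary keys are simply labels chosen here for the brackets of the table.

```python
# Proportion of homes (per cent) by ratio of respondent's to appraiser's estimate, from Table III.
table_iii = {
    "under 70%": 6,
    "70-89%": 20,
    "90-109%": 37,
    "110-129%": 19,
    "130-149%": 9,
    "150% and over": 9,
}

within_10 = table_iii["90-109%"]                                  # within plus or minus 10 per cent
over_30 = table_iii["130-149%"] + table_iii["150% and over"]     # overestimates beyond 30 per cent
under_30 = table_iii["under 70%"]                                 # underestimates beyond 30 per cent
beyond_30 = over_30 + under_30

print(within_10, beyond_30, over_30, under_30)
```

The imbalance between 18 per cent large overestimates and 6 per cent large underestimates is what suggests the upward tendency discussed in the text.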





3 This finding and the $77,000 mistake have obvious implications for checking procedures. It 
should be noted that the checking procedure used in processing the 1950 Survey of Consumer Finances 
varied according to the nature and extent of the projected analysis of the data. The data on value of 
houses received the minimum amount of checking. For the type of distribution actually published in 
the Federal Reserve Bulletin these clerical errors were of little importance. The clerical errors do have 
a large effect on the errors of the estimated mean value; however, the mean was neither intended nor 
submitted for publication. 
COMPARISONS OF THE MEANS OF THE TWO DISTRIBUTIONS⁴


The difference between the means obtained by the two methods of
measurement is $9,560 - $9,210 = $350. That is: the mean of $9,560
obtained from the responses of the home-owners seems to include a 
bias of $350 (if we accept the appraisers’ values as “true”). This bias 
is in the direction one would expect. The standard error of the differ- 
ence was calculated (by a formula proper to the complexities of the 
sample design) to be $170. Hence there appears to be a tendency (sta- 
tistically significant) for the home-owners to set higher values on their 
homes than do the professional appraisers. This tendency is small 
compared to the value of the home—about 4 per cent of the latter. 
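The significance statement can be illustrated by dividing the difference by its standard error. The sketch below assumes a two-sided test under a normal approximation; the article reports only the $170 standard error and does not state which test was applied.

```python
import math

mean_owner = 9560        # mean of respondents' estimates, dollars
mean_appraiser = 9210    # mean of appraisers' estimates, dollars
se_diff = 170            # standard error of the difference, as reported

bias = mean_owner - mean_appraiser
z = bias / se_diff

# ASSUMPTION: two-sided p-value from a normal approximation.
p_two_sided = 1 - math.erf(abs(z) / math.sqrt(2))

# Bias relative to the appraisers' mean value.
relative_bias = bias / mean_appraiser

print(f"bias = ${bias}, z = {z:.2f}, p = {p_two_sided:.3f}, relative = {relative_bias:.1%}")
```

The ratio is a little over two standard errors, which is why the difference is described as statistically significant yet small (about 4 per cent) in relation to the value of the home.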

This net average bias may appear small also in comparison with the 
large discrepancies found in the two values obtained for individual 
homes. The mean square difference of the two measurements was esti- 
mated as 9,580,000 compared to the estimate of the squared bias of 
100,000 (see equation 15 in Appendix). This result is consistent with 
the findings presented in Tables I and II which show also large dis- 
crepancies in individual estimates but small differences in the overall 
distribution of the two measurements. 

The relative importance of a bias depends on the size of the survey
to be taken. The sample mean of a simple random sample of n interviews
with respondents may be expected to be subject to a total
root-mean-square error of \(\sqrt{V(r)/n + D^2}\), where the first term under
the radical represents the total variability of the estimates from the
survey of respondents about their own mean and the second, the square
of the bias.⁵ As the size of the sample increases the first term will
decrease but the second will remain constant.

We may use the sample estimates obtained in our investigation to
examine the effect of the bias on the total error. For V(r) we have the
estimate v(r) = 32,650,000; and for \(D^2\) we have the estimate
\(d^2\) = 100,000 (see equation 15 in Appendix). Now let us take the value of
\(\sqrt{V(r)/n + D^2}\) for three different sample sizes, and under the two
assumptions: that \(D^2\) = 100,000 and that \(D^2\) = 0.

4 We include this analysis because it may be of general interest. We repeat: the mean home value 
was not sought nor published in the original survey. 

5 The term V(r)/n represents the variance of the sample mean as it is usually calculated, but it
actually includes both the error resulting because not every member of the universe was interviewed
(the sampling error proper) and any uncorrelated random response error which may be present in
the methods used, such as random clerical errors (see equation 7 in Appendix). The net average of the
response errors will be reflected in the squared bias term, \(D^2\). This expression shows that it is possible
to increase the accuracy of the estimate of a mean from a simple random sample in one of three ways:
by increasing the number of interviews (increasing n); by reducing the variability of the estimates,
V(r), (by reducing some of the errors of response); or by reducing the size of the bias \(D^2\) (for example,
by more careful training of interviewers). The practical problem in the administration of surveys is to
allocate resources among the three in such a way as to minimize the total error for a given outlay.
The total (root-mean-square) error of the mean house value under
six different conditions would then be as follows:


                                                                 Value if Sample Size (n) is
Value of \(D^2\)                 Formula                                100       1,000      10,000

(1) Total error if          \(\sqrt{32,650,000/n + 100,000}\)          $650       $360       $320
    \(D^2\) = 100,000

(2) Total error if          \(\sqrt{32,650,000/n}\)                    $570       $180       $60
    \(D^2\) = 0

Note that the total error for a sample of 100 is not greatly increased 
by the bias term, but, for a sample of 1,000, the effect of the bias term 
is large, while for a sample of 10,000 it is overwhelming. Where the 
facts are similar to those found in this investigation, an improvement 
in the accuracy of the estimate for surveys of a few hundred cases can 
probably be obtained most easily by increasing the size of the sample. 
For surveys of several thousand cases, however, it may be more effi- 
cient to allocate funds for a search for sources of bias and for develop- 
ment of techniques for reducing the bias than to allocate funds for an 
increase of the size of the sample.⁶
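The six values of the total error follow from the expression \(\sqrt{V(r)/n + D^2}\) with the estimates quoted in the text. The sketch below recomputes them; rounding to the nearest $10 and the "crossover" sample size at which the two terms under the radical are equal are illustrative additions, not figures given in the article.

```python
import math

V_R = 32_650_000   # estimate v(r), squared dollars
D2 = 100_000       # estimate of the squared bias, squared dollars

def total_rmse(n, variance=V_R, bias_sq=D2):
    """Total root-mean-square error of the mean of a simple random sample of size n."""
    return math.sqrt(variance / n + bias_sq)

def round_to_10(x):
    """Round to the nearest $10 (an assumed rounding convention)."""
    return int(round(x / 10.0)) * 10

sizes = [100, 1_000, 10_000]
with_bias = [round_to_10(total_rmse(n)) for n in sizes]
without_bias = [round_to_10(total_rmse(n, bias_sq=0)) for n in sizes]

# Sample size at which the variance term V(r)/n equals the squared bias.
crossover_n = V_R / D2

print(with_bias, without_bias, round(crossover_n))
```

On these estimates the two terms are equal at a sample of roughly 330 interviews, which is consistent with the remark that the bias matters little for a few hundred cases and dominates for several thousand.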

We have investigated the possibility that the discrepancy between 
the two measurements might prove to be a function of the value of the 
house. One can imagine, for example, that respondents might tend to 
overvalue low priced homes and to undervalue high priced homes. We 
divided the homes into groups based on the appraisers’ value and esti- 
mated the mean value of the houses in each group, first on the basis 
of the respondents’ values for the houses and then on the basis of the 
appraisers’. The difference between these two means has been plotted 
in Chart I. (See solid line “A.”) This graph indicates that the respondents
tend to overvalue homes priced below about $12,000. For homes
priced above that amount no clear tendency to under- or over-valuation 
appears. We feel that the discrepancy below $12,000 may be explained 
in part, but only in part, by errors in the estimates made by appraisers. 
Any such errors also would tend to give this graph a general trend 
downward to the right.⁷

















6 From the data of Table IV of the Appendix it seems that the contributions of the bias term are
less for estimates of the proportions of houses falling into designated class intervals.

7 For example, suppose the respondent estimated the value of this house correctly at $11,000.
The point on the chart to correspond would be X = $11,000, Y = 0, if the appraiser made the same
estimate. However, if the appraiser made an estimate of $10,000, the corresponding point would be
X = $10,000, Y = $1,000, which is above and to the left of the original point. An error by the appraiser
in the other direction would lead to a point below and to the right of the original point.
[CHART I. The differences of two measurements (r - a), taken as a function of the appraiser’s
values. The values of the sample estimates are given in units of thousands.]


THE ROOT-MEAN-SQUARE DIFFERENCE


As a measure of the average individual discrepancy between respond- 
ents and appraisers we use the root-mean-square difference, that is, 
the square root of the mean of the squared deviations between the 
pairs of estimates. We estimate this quantity at $3,100 for the sample 
as a whole. In other words if we assume that the appraiser’s estimate 
is the true value, the respondents are in error by an average of $3,100 
in their estimates. (From equation 15 in Appendix.) Actually there is 
no doubt that the appraisers also made errors, and the average dis- 
crepancy between the respondents’ estimates and “true value” of the 
property would be less than $3,100. 

How does the average discrepancy vary with the value of the home? 
Is the discrepancy a constant amount, or a constant proportion of the 
house, or some other function? On Chart I are plotted the root-mean- 
square differences—the r.m.s.(d) values—for each class of appraised 
values; the width of the intervals is $1,000 except at the ends, where 
classes were combined to obtain larger cells. (See the solid line “B.”) 
For values below $10,000 the r.m.s.(d) appears to be constant around 
$2,000. For values above $10,000 it is considerably more variable and 
larger; and, it appears to be proportional to the estimated value of the 
home. The line which represents a root-mean-square difference of one- 
fourth of the appraised value is drawn in. It appears to the eye to fit the 
distribution above $10,000 fairly well. In other words in our data the 
expected absolute value of the difference between the respondents’ and 
the appraisers’ estimates is about $2,000 for a house worth less than 
$10,000; while, for a house worth over $10,000, the expected value of 
the difference is one-fourth of the appraisers’ estimate. For a $16,000 
house, one would predict a respondent would differ from an appraiser 
by $4,000; for a $20,000 house, one would predict $5,000, and so forth. 
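The rule of thumb read off the chart can be written as a small piecewise function. This is a sketch of the description in the text, not a fitted model; the function name is hypothetical, and the treatment of a house appraised at exactly $10,000 (assigned here to the proportional branch) is an assumption, since the text speaks only of values below and above that figure.

```python
def predicted_rms_difference(appraised_value):
    """Approximate r.m.s. difference (dollars) between owner and appraiser estimates.

    Roughly constant at $2,000 below $10,000; one-fourth of the appraised
    value from $10,000 upward (boundary handling is an assumption).
    """
    if appraised_value < 10_000:
        return 2_000.0
    return 0.25 * appraised_value

for value in (6_000, 16_000, 20_000):
    print(value, predicted_rms_difference(value))
```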


ANALYSIS OF SOME SUB-GROUPS 


One aim of our investigation was to discover some of the variables 
which might be associated with response errors. For three cross-tabula- 
tions comparisons were made of the ratio of respondents’ to appraisers’ 
values. An attempt was made in the original survey to isolate those 
cases where the respondent seemed uncertain of his estimate. If this 
attempt were successful, it was thought that it might be possible to 
develop methods of analysis that would place more weight on the more 
reliable cases. The procedure tried was to instruct the interviewers as 
follows: 


Since some respondents have a very clear idea of the value of their house,
based on such things as what the house next door just sold for, while others
have only very vague notions, we have left space after question 31 in which 
you should note down any information he may give you about how he ar- 
rived at his estimate of the value of the house. Our objective is to distinguish 
between cases where we have the kind of accurate estimate we would prefer 
and cases where we have only vague information. In any case be sure to 
record the dollar value of the house. 


The coders were then instructed to study the answer as recorded by the 
interviewer and attempt to assign a rating according to how sure the 
respondent seemed to be of his answer. This rating proved very difficult 
to make; coders disagreed frequently as to the proper point on the scale 
at which to place an answer. The relevant data (not given here) show 
that the assigned rating of the appearance of reliability had no validity: 
the errors were about equally large in the various classes of assigned 
reliability. 

Secondly, occupation of the head of the family owning the house was 
selected as a measure of socio-economic status, on the hypothesis that 
people of higher status might be better informed. Thirdly, the popula- 
tion of the place (city) of residence of the respondent was selected on 
the hypothesis that knowledge of real estate values would be different 
in communities of different sizes. None of these hypotheses were sub- 
stantiated; no sizeable differences were noted. 

For four subgroups of the sample we calculated separately the esti- 
mates of our basic error equation (8). There exist a priori reasons why 
the accuracy of the estimates in each of these groups might turn out 
to be different than in the entire sample. The calculated equations are 
in the Appendix; here we shall summarize the results, using the root- 
mean-square difference—r.m.s.(d)—as the measure of accuracy. The 
conclusions we draw from these groups must be tempered by the knowl- 
edge that they were not properly selected subsamples of the entire 
sample; hence there may be other causes operating beyond that on 
which we focus our attention. 

a) In 65 cases the appraisers exceeded the minimum effort asked 
of them and went into the homes. We expected that their estimates 
would be more accurate, and that the r.m.s.(d) would be smaller. How- 
ever, the r.m.s.(d) for these 65 cases turned out to be $2,700 compared 
with $3,100 for the entire sample. The appraisers’ errors were not 
clearly increased by remaining outside the house. 

b) In homes purchased during the calendar year prior to the inter- 
view, the respondent was asked what he actually paid for his home. 
We expect that the reports of the respondents were fairly close
approximations to the true value at the time of purchase. The r.m.s.(d)
of $1,900 is reliably smaller than the $3,100 for the sample as a whole.
One should not infer, however, that the entire $1,900 is the result of 
errors by appraisers. For one thing, real estate values change with 
time, and up to a year might elapse between the purchase and the 
original interview, with several months more passing before the visit 
of the appraiser. 

c) In the Surveys of Consumer Finances interviewers are instructed 
to make efforts to interview the head of the household rather than some 
other member. In the 91 cases where the interviews were taken from 
some other member of the household the r.m.s.(d) was not—contrary 
to expectations—larger than for the sample as a whole. In fact it was 
$2,500 as against $3,100 for the entire sample. 

d) There were 59 cases where the head of the house was a female. 
For these the r.m.s.(d) was $3,900, which appears to be reliably higher 
than for the entire sample. 

The only important improvement in accuracy, then, was for re- 
spondents who purchased in the year prior to the survey. These re- 
spondents, as noted earlier, were asked what the property actually cost 
rather than their estimate of what it might be worth, hence, it is not 
surprising that their responses are close to the appraisers’ estimates. 


APPENDIX 


The Model. The symbol \(r_i\) denotes the value recorded for the \(i\)th home
as a response in the interview survey; and \(a_i\) denotes the value assigned
by the appraiser to the same home. The “true” (but unknown) value
is \(y_i\). Where there is little room for misunderstanding we shall drop the
subscript \(i\), and refer simply to r, a and y. The means over the entire
population for the three sets of values may be designated by:

\[ R = E(r), \quad A = E(a), \quad Y = E(y). \tag{1} \]

The operator “E” denotes the “expected value of.”⁸ The variances of
the three variables may be designated by:

\[ V(r) = E(r - R)^2, \quad V(a) = E(a - A)^2, \quad V(y) = E(y - Y)^2. \tag{2} \]




8 The means of the measurements \(r_i\) and \(a_i\) over a finite population would be variables also due to
the errors of measurements. But we may treat R and A as constants if we consider them as resulting
from a large number of reported measurements, or as resulting from a large population. By confining
ourselves to large populations we may also disregard any “finite population corrections” in our variance
formulas. The terms used here are generally in accord with those in: M. H. Hansen, W. N. Hurwitz,
and W. G. Madow, Sample Survey Methods and Theory (New York: Wiley, 1953) II, Chap. 12.

Another good treatment of the topic of errors of response may be found in W. G. Cochran, Sampling
Techniques, New York: Wiley, 1952, Chap. 13.

However, none of the sources known to us develop the model we need in terms of the differences
(r - a) of two sets of measurements, both subject to error. To what extent these non-sampling errors
may be considered to be random variables is a complex problem which we shall have to leave untreated.
The quantity \((r_i - y_i)\) denotes the individual error of the response in
the interview survey for the \(i\)th home; and \((a_i - y_i)\) denotes the error
in the appraiser’s estimate for the same home. The difference between
the two errors is equal to the difference of the two measurements:

\[ d_i = (r_i - y_i) - (a_i - y_i) = (r_i - a_i). \tag{3} \]

Furthermore, let us call the mean value (R - Y) the response bias;
(A - Y) the appraiser’s bias; and the difference between the two biases
is

\[ D = (R - A) = (R - Y) - (A - Y). \tag{4} \]


An important term in our model is the mean-square difference of the
measurements:

\[ \mathrm{M.S.}(d) = E(d^2) = E(r - a)^2. \tag{5} \]

We also need the expression for the covariance between the differences
in measurements and the appraiser’s values:

\[ \mathrm{Cov}(da) = E(d - D)(a - A) = E(r - a - R + A)(a - A) = E(r - a)(a - A), \tag{6} \]

also:

\[ \mathrm{Cov}(da) = E(r - R - a + A)(a - A) = E(r - R)(a - A) - E(a - A)^2 = \mathrm{Cov}(ra) - V(a). \tag{6a} \]

With the above definitions, we may express the basic equation for
our empirical investigations:

\[ V(r) + D^2 = V(a) + \mathrm{M.S.}(d) + 2\,\mathrm{Cov}(da). \tag{7} \]

For proof express \(E(r - A)^2\) in two different ways:

\[ E(r - A)^2 = E(r - R + R - A)^2 = V(r) + D^2 \]

and

\[ E(r - A)^2 = E[(r - a) + (a - A)]^2 = \mathrm{M.S.}(d) + V(a) + 2\,\mathrm{Cov}(da). \]
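Equation (7) is an algebraic identity, so it can be checked numerically on any set of paired values when the moments are taken over the whole set (dividing by n). The sketch below does this with simulated data; the simulated values, their error sizes and the $350 shift are hypothetical and are not the survey data.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

random.seed(0)
n = 500
# Hypothetical "true" values, appraisals and owner responses (dollars).
y = [random.uniform(2_000, 30_000) for _ in range(n)]
a = [yi + random.gauss(0, 1_500) for yi in y]
r = [yi + 350 + random.gauss(0, 3_000) for yi in y]

# Moments taken over the whole set, treated as the population.
R, A = mean(r), mean(a)
D = R - A
V_r = mean([(ri - R) ** 2 for ri in r])
V_a = mean([(ai - A) ** 2 for ai in a])
MS_d = mean([(ri - ai) ** 2 for ri, ai in zip(r, a)])
cov_da = mean([(ri - ai - D) * (ai - A) for ri, ai in zip(r, a)])

lhs = V_r + D ** 2                 # left side of equation (7)
rhs = V_a + MS_d + 2 * cov_da      # right side of equation (7)
print(lhs, rhs)
```

Both sides equal the mean of \((r - A)^2\), so they agree up to floating-point rounding whatever data are supplied.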
Our model would be simpler if the appraiser gave the “true” value
for every home, so that \(a_i = y_i\); and the error equation would become

\[ V(r) + (R - Y)^2 = V(y) + E(r - y)^2 + 2\,\mathrm{Cov}(r - y)(y). \]

Here V(y) is the “true” sampling variance, i.e., the variance among the
\(y_i\), which are the “true” values of the homes; and

\[ V(r) + (R - Y)^2 - V(y) = E(r - y)^2 + 2\,\mathrm{Cov}(r - y)(y) \]

is the increase in the total mean-square error due to errors of measurement.
Similarly, the increase in the total mean-square error, due to
the lesser accuracy of the \(r_i\) than the \(a_i\), may be measured as

\[ V(r) + D^2 - V(a) = E(r - a)^2 + 2\,\mathrm{Cov}(r - a)(a). \]

It is also interesting to note the relationship

\[ E(r - a)^2 = E(r - y)^2 + E(a - y)^2 - 2E(r - y)(a - y). \]

The covariance \(E(r - y)(a - y)\) of the two measurements may be positive
or zero, but it is not likely to be an important negative quantity in the
present instance. Therefore, the term \(E(r - a)^2\) available in this study
is likely to be larger than the mean-square error of response \(E(r - y)^2\),
by a quantity no greater than (but perhaps almost equal to) the
mean-square error \(E(a - y)^2\) of the appraiser’s measurements.⁹

Although the results were obtained from a complex multi-stage sam- 
ple, the discussion is given in terms of the composition of the response 
error for the individual homes which are the ultimate elements com- 
prising the population. The expressions of the relative effects of the 
bias and of the variable error are given in terms of simple random 
samples. It is hoped that in this form the data will be of greater general 
interest and usefulness in planning other surveys. The calculations are 
based on the “naive” estimates from the pooled sample values; greater 
refinements did not seem to be warranted by the available data.¹⁰

The basic relationship shown in (7) may be expressed in terms of
sample estimates as¹¹

\[ v(r) + d^2 = v(a) + \mathrm{m.s.}(d) + 2\,\mathrm{cov}(da). \tag{8} \]

We have the following unbiased estimates:

\[ \bar{r} = \frac{1}{n}\sum^{n} r, \quad \bar{a} = \frac{1}{n}\sum^{n} a, \tag{9} \]

\[ v(r) = \frac{1}{n-1}\sum^{n} (r - \bar{r})^2, \quad v(a) = \frac{1}{n-1}\sum^{n} (a - \bar{a})^2, \tag{10} \]

\[ \mathrm{cov}(ra) = \frac{1}{n-1}\sum^{n} (r - \bar{r})(a - \bar{a}), \tag{11} \]

\[ \mathrm{cov}(da) = \mathrm{cov}(ra) - v(a), \tag{12} \]

\[ \mathrm{m.s.}(d) = \frac{1}{n}\sum^{n} (r - a)^2 = \frac{1}{n}\Big[\sum^{n} r^2 + \sum^{n} a^2 - 2\sum^{n} ra\Big]. \tag{13} \]


9 For the benefit of future researchers we should like to point out that an estimate of \(E(a - y)^2\)
could have been obtained had we assigned some of the homes to two appraisers each; we thought of this
too late to carry out the necessary field work.

10 For the sake of simplicity and because we have no measure of it, we disregard the correlation
among the errors of individual homes, such as may be caused by interviewer bias.

11 Because the responses were “weighted” to correct for the use of different sampling rates, the
actual sample calculations were somewhat different from those shown here. For example, \(\bar{r}\) was
calculated as \(\sum^{n} w_j r_j / \sum^{n} w_j\), where \(r_j\) is the response, and \(w_j\) is the assigned weight of the
\(j\)th home in the sample. The calculation of the variances may be illustrated by

\[ v(r) = \frac{n}{n-1}\,\frac{1}{\sum^{n} w_j}\sum^{n} w_j (r_j - \bar{r})^2. \]


Although \((\bar{r} - \bar{a})\) is an unbiased estimate of D, \((\bar{r} - \bar{a})^2\) is not an unbiased
estimate of \(D^2\); but \(d^2\) is an unbiased estimate where

\[ d^2 = (\bar{r} - \bar{a})^2 - \frac{1}{n}\{v(r) + v(a) - 2\,\mathrm{cov}(ra)\}. \tag{14} \]

This happens because

\[ E(\bar{r} - \bar{a})^2 = E[(\bar{r} - R) - (\bar{a} - A) + (R - A)]^2 = V(\bar{r}) + V(\bar{a}) - 2\,\mathrm{Cov}(\bar{r}\bar{a}) + D^2. \]

The unbiased estimate \(d^2\) has some advantages: together with the
other unbiased estimates from the sample, it yields values for our error
equation (8) which balance out exactly. However, it also has some
disadvantages: it is a residual of sample values and it turns out to be
negative sometimes, an embarrassing situation for the square of a real
quantity. (One may decide to truncate the distribution of \(d^2\) at zero
by substituting the value zero for all negative sample estimates.
Alternately, one may use simply \((\bar{r} - \bar{a})^2\) with the knowledge that it has
a positive bias of known magnitude.)
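The sample estimates (9) to (14) can be collected in one routine to see that equation (8) balances exactly and that \(d^2\) can indeed come out negative. The sketch below uses the unweighted formulas as written in the text; the weighted calculation actually used in the study is not reproduced, and the function name, the `truncate` option and the input values are hypothetical.

```python
def error_components(r, a, truncate=False):
    """Unweighted sample estimates of the terms in equation (8)."""
    n = len(r)
    r_bar = sum(r) / n
    a_bar = sum(a) / n
    v_r = sum((x - r_bar) ** 2 for x in r) / (n - 1)
    v_a = sum((z - a_bar) ** 2 for z in a) / (n - 1)
    cov_ra = sum((x - r_bar) * (z - a_bar) for x, z in zip(r, a)) / (n - 1)
    cov_da = cov_ra - v_a                                   # equation (12)
    ms_d = sum((x - z) ** 2 for x, z in zip(r, a)) / n      # equation (13)
    d2 = (r_bar - a_bar) ** 2 - (v_r + v_a - 2 * cov_ra) / n  # equation (14)
    if truncate:
        d2 = max(d2, 0.0)   # optional truncation at zero, as mentioned in the text
    return {"v_r": v_r, "v_a": v_a, "cov_da": cov_da, "ms_d": ms_d, "d2": d2}

# Hypothetical paired values in units of $1,000.
r = [9.0, 12.5, 7.0, 15.0, 10.0, 6.5]
a = [8.5, 11.0, 7.5, 16.0, 9.0, 6.0]
c = error_components(r, a)
lhs = c["v_r"] + c["d2"]
rhs = c["v_a"] + c["ms_d"] + 2 * c["cov_da"]
print(lhs, rhs)

# Equal means but opposite errors give a negative d2.
negative_case = error_components([1.0, 3.0], [3.0, 1.0])
```

Note that truncating \(d^2\) at zero removes the exact balance of equation (8) in the cases where the truncation takes effect.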


SOME CALCULATIONS ON THE DOLLAR VALUE OF THE HOUSE 


The five terms of the basic equation (8) of the estimates of error com- 
ponents will be presented in this section for several situations. They 
will be given in units of $1,000; since in these variance components the
units are squared, a factor of \(10^6\) is needed to convert them to plain
dollar values.

1) Our principal interest is in the components of the equation dealing 
with all the 568 cases: 


32.65 + .10 = 26.69 + 9.58 − 3.52. (15)


Note the relatively large m.s.(d) term which yields the \(\sqrt{9.58 \times 10^6}\)
= $3,100 estimate for the r.m.s.(d) between the two measurements on
individual homes. But most of the discrepancies cancel, leaving a much
smaller net average error; the unbiased estimate of this bias is
\(\sqrt{.10 \times 10^6}\) = $320.

The total root-mean-square error of the responses is \(\sqrt{v(r)}\)
= \(\sqrt{32.65 \times 10^6}\) = $5,700, which is not much over the \(\sqrt{v(a)}\)
= \(\sqrt{26.69 \times 10^6}\) = $5,200 we would get from appraisers’ estimates.
Therefore, the practical surveyor may well be satisfied with the pre- 
cision of the interview response—if the bias term is not too large. 
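
The dollar figures quoted for equation (15) follow from the stated conversion factor. As a brief check, assuming only the five printed terms and the factor of 10⁶ (the helper name `rms_dollars` is ours):

```python
import math


def rms_dollars(component):
    """Convert a variance component in ($1,000)^2 to a root value in dollars."""
    return math.sqrt(component * 10**6)


# Terms of equation (15): v(r) + d^2 = v(a) + m.s.(d) + 2cov(da)
v_r, d2, v_a, ms_d, two_cov_da = 32.65, 0.10, 26.69, 9.58, -3.52

balance = (v_r + d2) - (v_a + ms_d + two_cov_da)  # zero up to rounding
rms_d = rms_dollars(ms_d)   # r.m.s. discrepancy on individual homes
bias = rms_dollars(d2)      # net average error
rms_r = rms_dollars(v_r)    # respondents
rms_a = rms_dollars(v_a)    # appraisers
```

Rounded as in the text, these give about $3,100, $320, $5,700 and $5,200 respectively.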

The difference between the two variance terms is reduced by the
sizeable negative covariance term \((-3.52 \times 10^6)\) between the difference
in measurements (r—a) and the appraisers’ values. This is probably due 
in part to overestimates among the appraisers’ high values and under- 
estimates among their low values. The negative covariance is in accord 
with the gentle negative slope of curve A on Chart I. 

2) If we allow the 13 gross coding errors (mentioned earlier) to 
stand uncorrected, the components are estimated as 


37.35 + .04 = 26.69 + 15.81 − 5.11.


Thus over a third of the original m.s.(d) term of 15.81 was due to the 
13 gross errors. However, the rise in the variance of the responses is 
more moderate (32 to 37). Moreover the estimate of the sample mean 
may be no worse off for these errors (ironically) because the bias term 
seems to be somewhat reduced. It seems that all these gross errors 
were in the direction of lowering the home-owners’ estimates and, as 
noted above, home-owners suffer from a tendency to overestimate the
value of their homes.

The effect of these coding errors on curve (A) in Chart I is to make
the \((\bar{r}-\bar{a})\) values for the classes above $10,000 more depressed and
more irregular. Curve (B) of the r.m.s.(d) values is also disturbed
above $10,000: the curve becomes more irregular and the slope becomes
greater (it seems to fit the line of \(\sqrt{E(r-a)^2} = a/3\)).

3) For the 65 cases where the appraiser went into the home the com- 
ponents are 


37.07 − .09 = 34.31 + 7.29 − 4.62.


4) For the 61 cases where the response was in terms of the amount 
paid for a recently purchased home we have 


23.14 − .05 = 21.76 + 3.67 − 2.34.


If we assume that the respondents gave the “true value” of their homes
in these cases, then we may accept this m.s.(d) term of 3.67 as a rough
estimate of the appraisers’ contribution to the discrepancy term. 

5) For the 91 cases where the respondent was not the head of the 
household the values are 
37.04 + .28 = 28.77 + 6.29 + 2.26.


6) For the 59 cases where the head of the household was a female 
the equation is 


44.78 + .33 = 28.10 + 15.17 + 1.84. 


RESULTS ON PROPORTIONS 


When we deal with the proportion of cases which fall into any class 
interval our variables are binomial. The values of \(r_j\) and \(a_j\) are restricted
to 0 and 1; and the value of \(d_j = (r_j - a_j)\) is either 0, +1 or −1. The
basic equation (8) of the estimated error components becomes:

\[ \Big[\frac{n}{n-1}\,p_r q_r\Big] + \Big[(p_r - p_a)^2 - \frac{1}{n-1}\,(p_r q_r + p_a q_a - 2p_{ra} + 2p_r p_a)\Big] \]
\[ = \Big[\frac{n}{n-1}\,p_a q_a\Big] + \big[p_r + p_a - 2p_{ra}\big] + \Big[\frac{2n}{n-1}\,(p_{ra} - p_r p_a - p_a q_a)\Big]. \tag{16} \]


Here \(p_r\) is the proportion of the homes placed into a specific frequency
group by the responses to the interviews, while \(p_a\) is the proportion
placed into that group by the appraisers’ estimates. Also \(p_{ra}\) is the
proportion placed into the same group both by respondent and by
appraiser. Furthermore, \(q_r = 1 - p_r\) and \(q_a = 1 - p_a\). The equation for the
5,000–7,499 group would be, as read from the values of Table II:


\[ \frac{637}{636}(.196)(.804) + \Big[(.196 - .193)^2 - \frac{1}{636}\{(.196)(.804) + (.193)(.807) - 2(.083) + 2(.196)(.193)\}\Big] \]
\[ = \frac{637}{636}(.193)(.807) + \big[.196 + .193 - 2(.083)\big] + \Big[2\cdot\frac{637}{636}\{(.083) - (.196)(.193) - (.193)(.807)\}\Big]. \]
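
The five terms of equation (16) are easily evaluated for any class. The sketch below is illustrative only: it assumes n = 637 (as read from the worked example), uses the proportions .196, .193 and .083 quoted above, and the function name `proportion_components` is ours.

```python
def proportion_components(p_r, p_a, p_ra, n):
    """Five terms of equation (16) for one frequency class.

    p_r, p_a: proportions placed in the class by respondents and appraisers
    p_ra: proportion placed in the class by both
    n: number of homes
    """
    q_r, q_a = 1 - p_r, 1 - p_a
    f = n / (n - 1)
    v_r = f * p_r * q_r
    v_a = f * p_a * q_a
    d2 = (p_r - p_a) ** 2 - (
        p_r * q_r + p_a * q_a - 2 * p_ra + 2 * p_r * p_a
    ) / (n - 1)
    ms_d = p_r + p_a - 2 * p_ra
    two_cov_da = 2 * f * (p_ra - p_r * p_a - p_a * q_a)
    return v_r, d2, v_a, ms_d, two_cov_da


# The 5,000-7,499 group.
v_r, d2, v_a, ms_d, two_cov_da = proportion_components(0.196, 0.193, 0.083, 637)
imbalance = (v_r + d2) - (v_a + ms_d + two_cov_da)  # zero up to rounding
```

For this class the sketch gives v(r) of about .1578, v(a) of about .1560 and m.s.(d) = .2230, in agreement with the corresponding entries of Table IV.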


In Table IV, columns (1) to (5), we present the estimates of the five 
components of equation (16) for each of the classes shown in Table I. 
In column (6) we show the difference (p,—pa) between the proportion 
assigned to each bracket in the surveys of respondents and appraisers. 
In column (7) we show the standard error of each difference shown in 
column (6). 
TABLE IV

VALUES OF THE TERMS OF THE ERROR EQUATION (16) FOR THE
PROPORTION IN EACH OF THE FREQUENCY CLASSES
AS SHOWN IN TABLES I AND II

Columns (1) to (5): values of the components of the error equation.
Column (6): difference between proportions found, (p_r − p_a).
Column (7): standard error of the difference.
Entries shown as … are not legible.

                          (1)      (2)      (3)      (4)        (5)        (6)      (7)
Frequency Group           v(r)  +  d^2  =   v(a)  + m.s.(d) + 2cov(da)

…                        .0282     …       .0225    .0320     −.0263     +0.6%    0.7%
…                        .1140     …       .1184    .1240       …        −0.6%    1.4%
5,000–7,499              .1578     …       .1560    .2230       …        +0.3%    1.9%
…                        .1690     …       .1842    .2360       …        −2.8%    1.9%
…                        .1548     …       .1400    .2030       …        +2.3%    1.8%
…                        .0609     …       .0804    .0930       …        −2.3%    1.2%
…                        .0669     …       .0591    .0710       …        +0.9%    1.1%
…                        .0273     …       .0216    .0340       …        +0.6%    0.7%
$30,000 and over         .0148     …       .0138    .0130       …        +0.1%    0.4%
Not ascertained          .0530     …       .0449    .0930       …        +0.9%    1.2%

Two Illustrative
Cumulated Groups:
$0– 7,500                .2296     …       .2287    .2090       …        +0.3%    1.8%
$0–10,000                .2454     …       .2412    .2130       …        −2.5%    1.8%


Note that the m.s.(d) terms, denoting the variability due to the
difference of the two responses, in column 4 of Table IV are large;
generally they are as large as, or larger than, the v(r) and v(a) terms which
ordinarily stand for sampling variability, shown in columns 1 and 3.
One may be tempted to assume that this variability would be much 
less if larger groups were investigated; however, the two larger groups 
shown on the bottom two lines of Table IV, comprising respectively 
about 35 per cent and 60 per cent of the population, also have m.s.(d) 
terms almost as large as the v(r) and v(a) terms. 

In spite of the large m.s.(d) the value of \([v(r)+d^2]\) is hardly any
larger than v(a). This is due to the large negative covariance term. 
That is: there exists a large gross response variation but its net effect 
on variability is very small. 

The net effect in terms of bias is even smaller. There is no bias term 
in column 2 which is reliable in terms of the standard error. If we
average the ratios of the \(d^2\) values to the respective v(r) values over the 10
classes we obtain .0005. In the calculations on the dollar mean the ratio
of the \(d^2\) term to the v(r) term was .0030. Thus we may say that the bias
term for the proportions remains undetected; and if it exists its effect
on the total error is probably less than in the case of the dollar mean.





A COMPARISON OF STRATIFIED TWO-STAGE 
SAMPLING SYSTEMS 


A. R. SEN, Uttar Pradesh, India
R. L. ANDERSON AND A. L. FINKNER, North Carolina State College


This paper deals with an empirical investigation of various 
stratified two-stage sampling systems for estimating totals of 
certain agricultural items of North Carolina. The 1940 Agri- 
cultural Census data were used for stratification, selection and 
estimation purposes. The observed data were the results of 
the 1945 Agricultural Census. Theory for the selection of n 
primary sampling units from a stratum with probability pro- 
portional to some measure of size but without replacement 
has already been developed by the senior author [11]. The 
principal contribution in this paper is the application of this 
theory to the selection of two primary sampling units without 
replacement from a stratum, where one of the units is selected 
with probability proportional to size and the other with equal 
probability. These results are compared with sampling systems 
(i) where both units are selected with probability proportional
to size but with replacement and (ii) where an equal number 
of primary sampling units are selected but only one from each 
stratum. 


1. INTRODUCTION 


THE theory for selecting a single primary sampling unit (p.s.u.)
per stratum with probability proportional to size (p.p.s.) in two-
stage designs was developed by Hansen and Hurwitz [1] in 1943 and 
was applied to human populations. The theory for selecting more than 
one p.s.u. from each stratum with p.p.s. but with replacement was 
developed by Hansen and Hurwitz [2]. Theory for selecting two p.s.u.’s 
without replacement has been developed independently by Midzuno 
[8], Horvitz and Thompson [3], Narain [9], and Sen [10]. Both Midzuno 
and Sen generalized the Hansen and Hurwitz approach to sampling 
a combination of n elements of the universe with probability propor- 
tional to some measure of size of the combination. Sen [11] further de- 
rived an expression for an unbiased estimate of the variance of the 
estimate. The theory thus developed has been applied to four items 
of the North Carolina (N.C.) agricultural population. Results of the 
investigation will be presented in this paper. 

One of the important results derived by Hansen and Hurwitz [1] was 
that selection of a p.s.u. from a stratum with p.p.s. was more efficient 
than selection with equal probability for a large class of populations. 
Their results were based on the between p.s.u. components only, the 
within p.s.u. component being relatively small in all instances. Using
a county as a p.s.u., Jebe [4] showed that the within p.s.u. component
was relatively large for many agricultural populations of N.C.,
considering any reasonably practicable total sample size. He recommended
investigation of the township¹ as a p.s.u. This aspect of the
sampling problem is also examined in this paper.

The values of four characteristics of the N.C. agricultural population 
have been studied. These are: 


1. Number of non-white operators
2. Value of land and buildings
3. Number of days worked off farms
4. Total number of farms.


The sources of data were: 


(a) U.S. Census of Agriculture 1945, vol. 1 part 16 (North Carolina 
and South Carolina), 

(b) The 1945 Census of Agriculture for each minor civil division 
and the sample Census of Agriculture as available in I.B.M. 
punch cards. 


For a general description of the population studied reference may be 
made to [7]. Most of the notations and terms employed in this paper 
have been used by Jebe [4]. For others, reference may be made to [8]. 

The principal objectives of this investigation were: 


(i) to examine some applications of theory already developed for
the selection of one p.s.u. per stratum,
(ii) to develop new theory for the selection of two p.s.u.’s per
stratum,
(iii) to compare these two selection procedures empirically.


Some theoretical comparisons of the two selection procedures were 
made; however, no useful rules were found for indicating a preference 
for either procedure. 


2. SAMPLING SYSTEMS WITH ONE P.S.U. SELECTED 
FROM EACH STRATUM 


Three intensities of stratification were employed in this study, viz., 
197 strata [6], 98 strata, and 40 strata. The stratification was based on
data provided by the 1940 Census of Agriculture. The counties of Dare 
and Swain were omitted from the study as they had only a small num- 
ber of farms. The state was first divided into 197 strata following m.c.d. 


1 In North Carolina the township is also referred to as a minor civil division (m.c.d.). 
lines and then 98 by combining two adjacent old strata, except that two
of the new strata each contained about two and a half of the old strata. 
In this division care was taken to construct, as nearly as practicable, 
equal sized strata measured in terms of 1940 number of farms. Geo- 
graphic contiguity of the m.c.d.’s within a stratum was maintained. 
The 40 strata were formed by combining contiguous counties of N.C. 
These forty strata were used for sampling designs with the county and 
with the m.c.d. as the p.s.u. However, equality of size of the strata was 
not feasible in this case. 

The basic sampling design employed consists of two stages of
sampling in which


(a) one p.s.u. (i.e., a county or an m.c.d.) is selected from each
stratum with equal probability or probability proportional to 
size, and 

(b) a constant number of sub-sampling units (s.s.u.) (except in the 
40 strata design) is selected at random from the s.s.u.’s located 
in the open country area of the p.s.u. selected in (a). 
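
The two stages (a) and (b) may be sketched as follows. This is a hypothetical illustration, not the procedure actually programmed for the study: the function name, the measures of size and the segment labels are all invented, and simple random sampling without replacement is assumed for the second stage.

```python
import random


def select_two_stage(sizes, segments, m, rng, pps=True):
    """Select one p.s.u. in a stratum, then m s.s.u.'s within it.

    sizes: measure of size for each p.s.u. (e.g. 1940 number of farms)
    segments: list of s.s.u. labels for each p.s.u.
    m: number of s.s.u.'s wanted (capped at the number available)
    pps: True for probability proportional to size, False for equal probability
    """
    indices = list(range(len(sizes)))
    if pps:
        i = rng.choices(indices, weights=sizes, k=1)[0]
    else:
        i = rng.choice(indices)
    m = min(m, len(segments[i]))
    # Simple random sample of s.s.u.'s without replacement.
    return i, rng.sample(segments[i], m)


# Hypothetical stratum of three p.s.u.'s.
farms_1940 = [120, 80, 200]
segments = [["a1", "a2", "a3"], ["b1", "b2"], ["c1", "c2", "c3", "c4"]]
rng = random.Random(1945)  # fixed seed so the draw can be repeated
i, chosen = select_two_stage(farms_1940, segments, 2, rng)
```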


The s.s.u.’s are area segments delineated for the Master Sample of 
Agriculture project [5]. In the 40 strata design the total number of 
s.s.u.’s specified for the state was allocated proportionally among 
strata, i.e., proportional to the total number of s.s.u.’s in the open coun- 
try portion of each stratum in 1945. 

A summary of the various designs examined is given in Table 1. 
These designs are classified into five sampling systems A, B, C, D and 
E. A sampling system consists of the sample design and the method of 
estimation. A notation for designating the sampling systems discussed 
in this paper has been adopted. For simplicity this notation is confined 
to a single stratum, as is the discussion to follow. If \(\Omega_i\) is the function
designating the selection probabilities to be used and \(Y'\) is an estimator
for the population total \(Y\) for the characteristic of interest, then
\((\Omega_i, Y')\) denotes the sampling system. For the two-stage sampling
designs under consideration, where simple random sampling is always
used in the second stage, \(\Omega_i\) has been confined to the probabilities used
for selecting the primary sampling units. To illustrate, if a single p.s.u.
(say the ith) is selected and subsampled, let \(Y_i'\) be an unbiased
estimate of \(Y_i\), its population total. Further suppose \(W_i\) is a weight
function associated with \(Y_i\) such that \(W_i Y_i'\) is an estimator for \(Y\), the
stratum total. If the p.s.u. is selected with probability proportional to \(X_i\),
then the sampling system is \([X_i/X,\ W_i Y_i']\). The sampling systems
A, B, C, D, and E are as follows:
A. (i) p.s.u. selection: p.p.s. with the number of farms in 1940 (\(F_0\)) as the
measure of size.
(ii) s.s.u. selection: designs 3, 4, 9, 10, 15, and 16 in Table 1.
(iii) estimation: ratio to the value of the characteristic in 1940 (\(X_i\)) for the
p.s.u. sampled.
(iv) sampling system (biased):

\[ \Big[\frac{F_{0i}}{F_0},\ \frac{Y_i'}{X_i}\,X\Big]. \]

B. (i) p.s.u. selection: equal probability for each of the N p.s.u.’s in the stratum.
(ii) s.s.u. selection: designs 5, 6, 11, 12, 17, and 18 in Table 1.
(iii) estimation: same as in A.
(iv) sampling system (biased):

\[ \Big[\frac{1}{N},\ \frac{Y_i'}{X_i}\,X\Big]. \]

C. (i) p.s.u. selection: as in A.
(ii) s.s.u. selection: designs 3, 4, 9, 10, 15, and 16 in Table 1.
(iii) estimation: ratio to the number of farms in 1940 for the p.s.u. sampled.
(iv) sampling system:

\[ \Big[\frac{F_{0i}}{F_0},\ \frac{Y_i'}{F_{0i}}\,F_0\Big]. \]

D. (i) p.s.u. selection: p.p.s. with the value of the characteristic in 1940 as a
measure of size.
(ii) s.s.u. selection: designs 1, 2, 7, 8, 13, and 14 in Table 1.
(iii) estimation: ratio to the value of the characteristic in 1940 for the p.s.u.
sampled.
(iv) sampling system:

\[ \Big[\frac{X_i}{X},\ \frac{Y_i'}{X_i}\,X\Big]. \]

E. (i) p.s.u. selection: equal probability.
(ii) s.s.u. selection: designs 5, 6, 11, 12, 17, and 18 in Table 1.
(iii) estimation: estimated p.s.u. total weighted by the number of p.s.u.’s in
the stratum.
(iv) sampling system:

\[ \Big[\frac{1}{N},\ N\,Y_i'\Big]. \]

It should be clear that the method of selecting the p.s.u. remains the
same for each of the four characteristics observed when sampling
systems A, B, C, or E are used. This is not the case for system D, in which
the probability of selection depends on the value of the characteristic
in 1940 for the p.s.u. sampled. Systems A, B, C, and E, therefore, could
be recommended for a general purpose survey, i.e. where the purpose
of the survey is to estimate the totals of several characteristics.
System D, however, could be recommended only where information is
desired on only one characteristic or where additional characteristics
observed must be subordinated in favor of a single characteristic.

TABLE 1
SAMPLING DESIGNS FOR ONE P.S.U. PER STRATUM

Entries shown as … are not legible.

Design                                               Sampling   No. of s.s.u.’s
No.      Method of selection of p.s.u.               Rate %     per selected p.s.u.

197 Strata
 1       P.P.S.-Value of characteristic in 1940         …            …
 2       P.P.S.-Value of characteristic in 1940         …            …
 3       P.P.S.-Number of farms in 1940                 …            …
 4       P.P.S.-Number of farms in 1940                 …            …
 5       Equal probability                              …            …
 6       Equal probability                              …            …

98 Strata
 7       P.P.S.-Value of characteristic in 1940         …            …
 8       P.P.S.-Value of characteristic in 1940         …            …
 9       P.P.S.-Number of farms in 1940                 …            …
10       P.P.S.-Number of farms in 1940                 …            …
11       Equal probability                              …            …
12       Equal probability                              …            …

40 Strata*
13       P.P.S.-Value of characteristic in 1940         …            …
14       P.P.S.-Value of characteristic in 1940         …            …
15       P.P.S.-Number of farms in 1940                 …            …
16       P.P.S.-Number of farms in 1940                 …           20
17       Equal probability                              …            5
18       Equal probability                              …           20

* No. s.s.u.’s per p.s.u. is an average figure.

Expressions for mean square errors, i.e. the between and within
components of variance and the bias terms, for the various sampling
systems are given in the appendix. The general procedure for derivation
will be indicated here. Consider the sampling system \((\Omega_i, W_i Y_i')\) as
illustrated above. It is easy to see that, in general, this system is biased
for the estimation of \(Y\), because

\[ E(W_i Y_i') = E_1 E_2 (W_i Y_i') = E_1 (W_i Y_i), \]

where \(E_1\) refers to expectation over the first stage of sampling and \(E_2\)
over the second stage. This estimate is unbiased if, and only if,
\(\Omega_i = 1/W_i\). In particular, let \(\Omega_i\) be equal to \(P_i\) where \(\sum_i P_i = 1\). Thus

\[ E_1 (W_i Y_i) = \sum_i P_i W_i Y_i . \]
This sum equals \(Y\) if, and only if, \(P_i = 1/W_i\) for all \(i\). Hence,
\([1/W_i,\ W_i Y_i']\) is an unbiased sampling system. A general expression
for the mean square error for systems in the class considered is given by

\[ E[W_i Y_i' - Y]^2 = \underbrace{E[W_i^2 (Y_i' - Y_i)^2]}_{\text{Within variance}} + \underbrace{E[W_i Y_i - E(W_i Y_i)]^2}_{\text{Between variance}} + \underbrace{[E(W_i Y_i) - Y]^2}_{\text{Square of Bias}} . \tag{1} \]

The mean square error for a particular system is obtained by substituting
the corresponding values of \(W_i\) for the system in equation (1) above
and taking the expectations according to the probability function \(\Omega_i\),
e.g. for system A, \(W_i = X/X_i\) and \(\Omega_i = F_{0i}/F_0\). The mean square error
for the state estimate is found by summing (1) over all strata, except
that the bias is first summed over all strata and the total bias is
squared.
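
The three terms of equation (1) can be illustrated for a single stratum. The sketch below is not part of the original computations: the three p.s.u.'s, their 1940 and 1945 values, and the within-p.s.u. variances of the estimates are all hypothetical, and the function name `mse_components` is ours. It compares system A with system D, which share the same weights but differ in the selection probabilities.

```python
def mse_components(P, W, Y, within_var):
    """Within, between and squared-bias terms of equation (1), one stratum.

    P: selection probabilities of the p.s.u.'s (summing to 1)
    W: weights W_i
    Y: true p.s.u. totals Y_i
    within_var: variance of the estimate Y_i' about Y_i, for each p.s.u.
    """
    total = sum(Y)
    expected = sum(p * w * y for p, w, y in zip(P, W, Y))
    within = sum(p * w ** 2 * s for p, w, s in zip(P, W, within_var))
    between = sum(p * (w * y - expected) ** 2 for p, w, y in zip(P, W, Y))
    bias_sq = (expected - total) ** 2
    return within, between, bias_sq


# Hypothetical stratum of three p.s.u.'s.
F0 = [100.0, 150.0, 250.0]  # number of farms in 1940
X = [40.0, 90.0, 70.0]      # value of the characteristic in 1940
Y = [50.0, 80.0, 90.0]      # value of the characteristic in 1945
s2 = [4.0, 9.0, 16.0]       # assumed within-p.s.u. variances of Y_i'

W_ratio = [sum(X) / x for x in X]   # W_i = X / X_i, used by A and D
P_A = [f / sum(F0) for f in F0]     # system A: p.p.s. to number of farms
P_D = [x / sum(X) for x in X]       # system D: p.p.s. to the characteristic

comp_A = mse_components(P_A, W_ratio, Y, s2)
comp_D = mse_components(P_D, W_ratio, Y, s2)
```

In system D the product of probability and weight is unity for every p.s.u., so the bias term vanishes; in system A it does not, which is why A is described as biased.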


3. ANALYSIS OF VARIANCE COMPONENTS FOR SYSTEMS 
A, B, C, D, AND E


In order to compare sampling systems A, B, C, D, and E the
estimated \((C.V.)^2 \times 10^4\), where \((C.V.)^2\) is the estimated mean square error
divided by \(Y^2\), are presented in Tables 2 and 3. The within p.s.u.
components are shown in Table 2, the total error in Table 3. The between
components of error may be obtained by subtraction. This latter com- 
ponent includes both the between p.s.u. variance and the bias con- 
tribution for systems A and B. In calculating the between component 
contributions, exact expressions for the expected values have been ob- 
tained. Since information was available for only a one in eighteen sys- 
tematic sample of the Master Sample segments within each county, 
considerable difficulty was experienced in obtaining estimates of the 
within p.s.u. component of the total error. Furthermore, this sample 
embraced incorporated, unincorporated and open country areas. In 
this connection Jessen [5, p. 536] says, “The areas into which the open 
country zone was partitioned serve as units for sampling either or both 
farms and persons whether farm or non-farm. This portion of the sam- 
ple is as useful, therefore, for a sample census of population as for a 
sample census of farms. This dual purpose sampling unit is feasible 
only in the open country, where the majority of the families are en- 
gaged in farming.” Hence only data for the open country area were 
used in the estimation of within p.s.u. variances. The Master Sample 
segment summary cards which belonged to incorporated or unincor- 
porated places or in a few cases to open country areas falling within 
metropolitan districts, or which formed subunits of multiple units had
to be dropped. As the open country area included about 90 per cent of
the total number of segments for North Carolina, the within p.s.u.
variation for this section of the state is fairly representative of the
entire state. Strictly speaking, however, the conclusions drawn are valid
for the open country area only. It was further assumed that the estimate
of the within county and m.c.d. variation obtained from systematic
sampling is approximately equal to that of a random sample if an equal
number of segments are selected from the same population.


TABLE 2

ESTIMATED \((C.V.)^2 \times 10^4\) FOR THE WITHIN P.S.U. COMPONENT OF ERROR

Entries shown as … are not legible.

SAMPLING SYSTEM*                       A†    B†

CHARACTERISTIC

Sampling Rate (%)                       1     2

197 Strata (m.c.d.)
No. of Non-White Operators            135    63
Value of Land and Buildings            17     …
No. of Days Worked off Farms          154    72
Total No. of Farms                     16     7

98 Strata (m.c.d.)
No. of Non-White Operators            151    65
Value of Land and Buildings            18     8
No. of Days Worked off Farms          193    83
Total No. of Farms                     15     7

Sampling Rate (%)                     0.5     2

40 Strata (m.c.d.)
No. of Non-White Operators            345    38
Value of Land and Buildings            35     4
No. of Days Worked off Farms          681   835    99
Total No. of Farms                     27    28     4

40 Strata (County)
No. of Non-White Operators            145     …   152    35
Value of Land and Buildings            41     …    42    10
No. of Days Worked off Farms            …     …
Total No. of Farms                     36     9    37     9

* See Section 2 for definitions of sampling systems.
† The bias contribution to the total error is included in the between p.s.u. \((C.V.)^2\).

The within p.s.u. component required the estimation of both within 
county and within m.c.d. variation. For a few of the counties and for 
a great many more of the m.c.d.’s, the number of s.s.u.’s available for 
estimating the within variation was too small to provide efficient 
estimates. Furthermore, of the 941 m.c.d.’s used in this investigation, 
342 provided no estimates of the within variation, since data were avail-
able in each of these for either none or only one s.s.u. 

TABLE 3

ESTIMATED \((C.V.)^2 \times 10^4\) FOR THE TOTAL COMPONENT OF ERROR

Entries shown as … are not legible.

SAMPLING SYSTEM*                       B†

CHARACTERISTIC

Sampling Rate (%)                       1     2

197 Strata (m.c.d.)
No. of Non-White Operators             74   168    96
Value of Land and Buildings            13    27    18
No. of Days Worked off Farms          612   750   668
Total No. of Farms                      9    19    11

98 Strata (m.c.d.)
No. of Non-White Operators            208   122
Value of Land and Buildings            34    24
No. of Days Worked off Farms         1405  1294
Total No. of Farms                     22    13

Sampling Rate (%)                     0.5     2

40 Strata (m.c.d.)
No. of Non-White Operators            112   463   157
Value of Land and Buildings            35    70    39
No. of Days Worked off Farms         4004  5235  4500
Total No. of Farms                     17    49    24

40 Strata (County)
No. of Non-White Operators            150    39   161    44
Value of Land and Buildings            42    11    44    12
No. of Days Worked off Farms          114    63   162   108
Total No. of Farms                     38    10    44    16

* See Section 2 for definitions of sampling systems.
† The bias contribution to the total error is included in the between p.s.u. \((C.V.)^2\).


The method of estimation used consisted of pooling the observed
within p.s.u. variances for contiguous p.s.u.’s so that the resultant
estimates were based on more degrees of freedom, thus increasing their
stability. This method assumes, of course, that the true within p.s.u.
variances do not vary for those p.s.u.’s which were combined. Since
this assumption is not generally valid, the estimates obtained of the
within variation may be slightly biased, thus affecting any comparisons
amongst two or more sampling systems. This point needs further
investigation. Three sets of pooled estimates of within p.s.u. variances
were worked out.


(a) The 40 strata with the county as the p.s.u. were grouped into 20
strata,⁴ each new stratum consisting of two contiguous old strata.
The within county variances for each new stratum were pooled
to yield an over-all estimate of variance for the stratum. This
estimate was used for each county in the stratum.

(b) For each of the 20 strata obtained in (a) the within m.c.d.
variances were pooled to yield another estimate of the within p.s.u.
variance for each stratum when the designs using 40 strata with
the m.c.d. as the p.s.u. were studied.

(c) The 98 strata with the m.c.d. as the p.s.u. were pooled into 20
strata such that the stratification was almost the same as in (a)
and (b). The within m.c.d. variances within each new stratum
were pooled to yield an estimate of within p.s.u. variance for the
stratum. This was applied to each m.c.d. within each stratum,
when the designs using 98 and 197 strata were studied.
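
The pooling in (a) to (c) may be pictured as follows. The text does not state the pooling formula; the sketch assumes the customary weighting of each within-p.s.u. variance by its degrees of freedom, and the function name and the figures are hypothetical.

```python
def pooled_variance(groups):
    """Pool within-p.s.u. sample variances by degrees of freedom.

    groups: list of (n_i, s2_i) pairs, n_i being the number of s.s.u.'s
    observed in a p.s.u. and s2_i the sample variance among them.
    P.s.u.'s with fewer than two s.s.u.'s supply no estimate and are skipped.
    Returns the pooled variance and its degrees of freedom.
    """
    usable = [(n, s2) for n, s2 in groups if n >= 2]
    df = sum(n - 1 for n, _ in usable)
    if df == 0:
        raise ValueError("no p.s.u. has two or more s.s.u.'s")
    return sum((n - 1) * s2 for n, s2 in usable) / df, df


# Three contiguous p.s.u.'s, one of which has a single s.s.u.
pooled, df = pooled_variance([(3, 2.0), (5, 4.0), (1, 0.0)])
```

The pooled value would then be applied to every p.s.u. in the new stratum, including those which supplied no estimate of their own.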


The within county contributions to total error are considerably 
greater than the between contributions for all the sampling systems. 
Even with the m.c.d. as the p.s.u., this contribution is a very important 
factor. As pointed out in the introduction, this fact was also observed 
by Jebe [4]. Hence it might be feasible to consider a delineation of 
s.s.u.’s which are more homogeneous than the present Master Sample
segments. 

It can be seen from Table 3 that the total \((C.V.)^2\) for all the
characteristics and for all the sampling systems using 40 strata is less when
the p.s.u. is a county than when the p.s.u. is an m.c.d. This difference
is marked for number of days worked off farms for all the sampling
systems, particularly A and B. One reason for this marked difference
is the smaller within contribution when the p.s.u. is a county. This
seemingly anomalous result arises from the instability of the weights
\((W_i = X/X_i)\) for number of days worked off farms (and also for
number of non-white operators) when the p.s.u. is an m.c.d. In many
m.c.d.’s, \(X_i\) is very small even though the probability of selection
\((F_{0i}/F_0)\) is large; since the within component is \(E[W_i^2 (Y_i' - Y_i)^2]\), a
very small value of \(X_i\) (large value of \(W_i\)) can have a tremendous
effect on the within contribution. The \(W_i\) are much more stable when the
p.s.u. is a county.

Six main comparisons of sampling systems have been made on the 
basis of the relative errors shown in Tables 2 and 3. These comparisons 
can be divided into two groups. 





4 These will be described below in the discussion on the selection of two or more p.s.u.’s from a 
stratum. 
(a) Comparisons of sampling systems differing in the method of selection

It would appear from the tables that the biased system A, where 
selection is p.p.s. to number of farms in 1940 but the estimator is the 
ratio to the value of the characteristic in 1940, is more efficient for all 
the characteristics and for all stratifications than the biased system B, 
where selection is made with equal probability with the same estimator. 
The gain in efficiency is, however, very small when 40 counties are 
used for estimating the number of non-white operators and value of 
land and buildings. This does not mean that selection with equal 
probability is almost as effective as p.p.s. selection, since the appropri- 
ate comparison is on only the between p.s.u. components of variance. 
For the between p.s.u. components of variance, system A showed con- 
siderable gain in efficiency relative to system B. System D, where 
selection is made with p.p.s. to value of the characteristic in 1940, is 
more efficient than system A or B for all characteristic totals, except 
for total number of farms where systems A and D are identical. The 
relative efficiency of system D is highly pronounced for estimating the 
number of non-white operators and number of days worked off farms 
when the m.c.d. is used as the p.s.u. 


(b) Comparisons of sampling systems differing in the method of estima- 
tion 

The unbiased system C, in which selection is p.p.s. to number of 
farms in 1940 and the estimator is ratio to number of farms in 1940, is 
generally more efficient than the biased system A, when the m.c.d. is 
used as the p.s.u. The gain in efficiency is most pronounced for number 
of days worked off farms and is identically unity for total number of 
farms. With the county as the p.s.u., the relative efficiency of C to A 
is considerably reduced and is in fact less than unity for value of land 
and buildings and number of days worked off farms. 

The unbiased system E, where selection is with equal probability and 
estimation is accomplished by a simple expansion of the estimated 
p.s.u. total by the number of p.s.u.’s in the stratum, is generally more 
efficient than the biased system B for estimating the number of non- 
white operators and the number of days worked off farms using the 
m.c.d. as the p.s.u. However, the situation is reversed for value of land 
and buildings and total number of farms, for which the correlations 
between the 1940 and 1945 values are both high. When the county is 
used as the p.s.u., system B is more efficient than system E for
estimating totals of all the characteristics.

As regards the between components of variance, the county is always
a better p.s.u. for all the characteristics for biased systems A and B
compared to unbiased systems C and E. The reduction in the sampling
error portion of this component for the biased systems A and B, due
to the high correlations between the 1940 and 1945 values of each of
the characteristics with the county as a p.s.u., more than compensated
for the loss due to the bias contribution.

Because of the high C.V. for number of days worked off farms, it is 
very difficult to find a sampling system which will give acceptable re- 
sults for all four characteristics. 

Any one characteristic total can be estimated satisfactorily by use 
of system D for that characteristic; however, this procedure will not 
give satisfactory results for the other three characteristics. Suppose the 
standard is set up that, using a two per cent sample, each characteristic 
total shall be estimated with no more than a ten per cent C.V., i.e. an 
accuracy of estimation to within 20 per cent of the item total with 95 
per cent confidence. None of the sampling systems will provide this 
accuracy for all types of stratification and p.s.u. considered here. How- 
ever if 40 strata with the county as the p.s.u. are used, system A will 
meet the standard and system B will almost meet it. If 197 strata are 
used, with a two per cent sampling rate, systems C and E will provide
estimates of the characteristic totals within an eight per cent C.V. 
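
In reading Tables 2 and 3 against such a standard it may help to note that a tabled \((C.V.)^2 \times 10^4\) is simply the square of the C.V. expressed in per cent. The helper names below are ours, the entries tested are illustrative, and the 95 per cent limits are taken as two standard errors, as the text does.

```python
import math


def cv_percent(tabled_value):
    """C.V. in per cent from a tabled (C.V.)^2 x 10^4 entry.

    (C.V.)^2 x 10^4 equals (100 x C.V.)^2, so the square root suffices.
    """
    return math.sqrt(tabled_value)


def meets_standard(tabled_value, max_cv_percent=10.0):
    """True if the entry corresponds to a C.V. within the stated limit."""
    return cv_percent(tabled_value) <= max_cv_percent


def limits_95(cv_pct):
    """Approximate 95 per cent limits, in per cent of the item total."""
    return 2 * cv_pct


threshold_10 = 10.0 ** 2  # a ten per cent C.V. corresponds to an entry of 100
threshold_8 = 8.0 ** 2    # eight per cent corresponds to 64
threshold_5 = 5.0 ** 2    # five per cent corresponds to 25
```

Thus an entry of 100 or less meets the ten per cent standard, and an entry must fall to 25 or less before the C.V. is within five per cent.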

None of the systems considered will provide an estimate of the num- 
ber of days worked off farms within a five per cent C.V. Hence, it was 
deemed advisable to investigate the possibility of sampling from each 
of the 98 counties (this would be single stage sampling). If all 98 coun- 
ties were sampled, the between component of the total error would van- 
ish; hence, the total error would be simply the within component. 
This within component, using the 98 counties as strata, was determined 
easily from the calculations of the within component for system E for 
the 40 strata with the county as the p.s.u. In order to use the existing 
calculations, it is noted that, except for weighting factors, the within 
component of a 0.5 per cent sample using 40 counties corresponds to the 
total error for about a 1.25 per cent sample using all 98 counties 
(actually about 5 s.s.u.’s per county) and a 2 per cent sample of 40 coun- 
ties corresponds to a 5 per cent sample of all 98 counties. The estimated 
per cent C.V. using the 98 counties as strata are presented below. 

                                        Sampling Rate 
Characteristic                          1.25%     5% 

Number of Non-White Operators 
Value of Land and Buildings 
Number of Days Worked off Farms 
Total Number of Farms 

From these results it appears that a sampling rate of 2.5 per cent 
would be sufficient to estimate each characteristic within a five per 
cent C.V. if all 98 counties were included in the sample. 


4. SELECTION OF TWO PRIMARY SAMPLING UNITS FROM A STRATUM 


Only one intensity of stratification was employed in this study. The 
state was divided into 20 strata. Each stratum was formed by combin- 
ing two contiguous strata of the 40 strata with the county as the p.s.u. 
discussed under selection of one p.s.u. from a stratum. Four sampling 
systems F, G, H, and K were considered for each of the four char- 
acteristics. These are: 


F. (i) p.s.u. selection: the first with p.p.s. to the value of the characteristic in 
1940 and the other with equal probability but without replacement from 
the remaining p.s.u.’s in the stratum. 

(ii) s.s.u. selection: random and independently from each of the p.s.u.’s 
selected in (i) above. The number of s.s.u.’s selected from each of the 20 
strata was proportional to the total number of s.s.u.’s in the open 
country area of the stratum. Two subsampling rates were used, i.e. 0.5 
and 2.0 per cent. 

(iii) estimation: ratio of the estimated total of the characteristic to the total 
value of the characteristic in 1940 for the p.s.u.’s selected in (i). 

(iv) sampling system: 

$$\left[\frac{X_i + X_j}{(N-1)X};\ \frac{Y_i' + Y_j'}{X_i + X_j}\,X\right]$$



G. (i) p.s.u. selection: the first with p.p.s. to the number of farms in 1940 and 
the other with equal probability but without replacement from the 
remaining p.s.u.’s in the stratum. 

(ii) s.s.u. selection: same as in F (ii). 

(iii) estimation: ratio of the estimated total of the characteristic to the total 
number of farms in 1940 for the p.s.u.’s selected in G (i). 

(iv) sampling system: 

$$\left[\frac{F_{0i} + F_{0j}}{(N-1)F_0};\ \frac{Y_i' + Y_j'}{F_{0i} + F_{0j}}\,F_0\right]$$


H. (i) p.s.u. selection: the two p.s.u.’s each with p.p.s. to the value of the 
characteristic in 1940 but with replacement. 

(ii) s.s.u. selection: same as in F (ii). 

(iii) estimation: average of the ratios of the estimated totals of the 
characteristic to the corresponding value of the characteristic in 1940 for the 
p.s.u.’s selected. 

(iv) sampling system: 

$$\left[\frac{2X_iX_j}{X^2};\ \left(\frac{Y_i'}{X_i} + \frac{Y_j'}{X_j}\right)\frac{X}{2}\right]$$

K. (i) p.s.u. selection: two p.s.u.’s each with p.p.s. to the total number of farms 
in 1940 but with replacement. 

(ii) s.s.u. selection: same as in F (ii). 

(iii) estimation: average of the ratios of the estimated total of the 
characteristic to the total number of farms in 1940 for the p.s.u.’s selected. 

(iv) sampling system: 

$$\left[\frac{2F_{0i}F_{0j}}{F_0^2};\ \left(\frac{Y_i'}{F_{0i}} + \frac{Y_j'}{F_{0j}}\right)\frac{F_0}{2}\right]$$

The method of selecting the p.s.u.’s for systems F and H depends on 
the value of the characteristic for these units in 1940. Therefore, these 
systems would be useful only for specific purpose surveys. On the other 
hand, systems G and K for which the method of selection of the p.s.u.’s 
is based on number of farms in 1940 would be suitable for general pur- 
pose surveys. Expressions of variances for each of the sampling systems 
described above are given in the Appendix; however, the procedure for 
arriving at an expression for the variance will be indicated for one of 
the systems. Consider the unbiased sampling system 


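$$F:\ \left[\frac{X_i + X_j}{(N-1)X};\ \frac{Y_i' + Y_j'}{X_i + X_j}\,X\right].$$

$$
\begin{aligned}
\operatorname{Var}\left[\frac{Y_i' + Y_j'}{X_i + X_j}\,X\right]
&= E\left\{\left[\frac{Y_i' + Y_j'}{X_i + X_j}\,X - Y\right]^2\right\} \\
&= E\left\{\left[\frac{Y_i' + Y_j'}{X_i + X_j}\,X - \frac{Y_i + Y_j}{X_i + X_j}\,X
   + \frac{Y_i + Y_j}{X_i + X_j}\,X - Y\right]^2\right\} \\
&= \underbrace{\left\{\sum_{i<j}\frac{(Y_i + Y_j)^2\,X}{(N-1)(X_i + X_j)} - Y^2\right\}}_{\text{Between}}
   + \underbrace{\left\{\sum_{i<j}\frac{(Z_i + Z_j)\,X}{(N-1)(X_i + X_j)}\right\}}_{\text{Within}}
\end{aligned}
$$

where $Z_i = M_i(M_i - m_i)\sigma_i^2/m_i$. 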
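
The between part of this derivation can be checked by exact enumeration. The following is a minimal sketch, assuming a single hypothetical stratum of four p.s.u.'s whose 1940 values and 1945 totals are invented for illustration, and assuming the p.s.u. totals are known without subsampling error (so only the between component appears). The function names are illustrative and are not taken from the paper.

```python
from fractions import Fraction
from itertools import combinations


def pair_probabilities(x):
    """Probability of each unordered pair (i, j) of p.s.u.'s under system F."""
    n, total_x = len(x), sum(x)
    # First unit drawn p.p.s. to x, second with equal probability from the
    # remaining n - 1 units; adding both orders gives (x_i + x_j) / ((n-1) X).
    return {(i, j): Fraction(x[i] + x[j], (n - 1) * total_x)
            for i, j in combinations(range(n), 2)}


def ratio_estimate(x, y, i, j):
    """Ratio estimate of the stratum total from the pair (i, j)."""
    return Fraction(y[i] + y[j], x[i] + x[j]) * sum(x)


def expectation_and_between_variance(x, y):
    """Exact mean and variance of the estimate over all possible pairs."""
    probs = pair_probabilities(x)
    mean = sum(p * ratio_estimate(x, y, i, j) for (i, j), p in probs.items())
    second_moment = sum(p * ratio_estimate(x, y, i, j) ** 2
                        for (i, j), p in probs.items())
    return mean, second_moment - mean ** 2


# Hypothetical stratum of four p.s.u.'s (values invented for illustration).
X_1940 = [120, 80, 200, 150]
Y_1945 = [130, 60, 210, 170]

mean, between_var = expectation_and_between_variance(X_1940, Y_1945)
print(mean == sum(Y_1945))   # the estimate is unbiased for the stratum total
print(float(between_var))    # between-p.s.u. variance for this toy stratum
```

Exact fractions are used so that unbiasedness shows up as an equality rather than an approximate agreement.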


5. ANALYSIS OF VARIANCE COMPONENTS FOR SYSTEMS F, G, H, AND K 


This section will deal with the analysis of the variance components 
of the sampling systems described in Section 4. The results of the 
analysis for systems C and D, where only one p.s.u. is selected from a 
stratum, are also presented here to facilitate comparative study with 
systems in which two p.s.u.’s are selected from a stratum. In this study, 
as before, the square of the coefficient of variation, (C.V.)², will be 
used for the total error. Calculations for both the between and within 
components of error have been made only for systems F and G, in 
which the selection of p.s.u.’s is made without replacement. For sys- 
tems H and K, in which the p.s.u.’s are selected with replacement, cal- 
culations have been made on the between p.s.u. components of the error 
only. 

From Tables 4 and 5 it would appear that systems F and G are, re- 
spectively, more efficient than systems D and C in estimating number 
of days worked off farms and are highly so on the between components 


TABLE 4 

BETWEEN COMPONENTS OF (C.V.)² × 10⁴ FOR SAMPLING 
SYSTEMS C, D, F, G, H, AND K* 

Measure of Size                  1940 No. of Farms    1940 Value of Char. 
Sampling System                    G     K     C        F     H     D 

CHARACTERISTIC 
No. of Non-White Operators        16    42    16 
Value of Land and Buildings       10    27     7 
No. of Days Worked off Farms      82   252   111 
Total No. of Farms                 1     4     2 

* For definition of sampling systems C and D, see Section 2 and for F, G, H, and K, see Section 4. 
(C.V.)² × 10⁴ rounded to the nearest integer, but (C.V.)² calculated correct to the sixth decimal figure. 


TABLE 5 

TOTAL (C.V.)² × 10⁴ FOR SAMPLING SYSTEMS C, D, F, AND G 

Measure of Size                     1940 No. of Farms          1940 Value of Char. 
Sampling System                      G            C            F            D 
Sampling Rate %                   0.5    2     0.5    2     0.5    2     0.5    2 

CHARACTERISTIC 
No. of Non-White Operators        125   42     112   39     115   30     109   28 
Value of Land and Bldgs.           52   20      45   16      45   12      40   11 
No. of Days Worked off Farms      127   93     152  121      92   55      96   60 
Total No. of Farms                 41   11      38   10.5    41   11      38   10.5 





of variance. With an increase in the sub-sampling rate from 0.5 to 2.0 
per cent, there is an appreciable reduction in the total error. Systems 
F and G have an additional advantage over systems D and C, respec- 
tively, in that they would provide estimates of the between components 
of error from the sample. On the between components of variance, systems 
F and G are, respectively, more efficient than systems D and C 
in estimating the total number of farms. This seems to be a paradox, 
since the between p.s.u. (C.V.)² for any of 20 strata for systems F and 
G is expected to include in addition some between strata variance (of 
the original 40 strata). The method of estimation introduced in estimating 
the population total by the selection of two p.s.u.’s seems to cut 
down the variance of the estimated total, because the ratio 
$(Y_i' + Y_j')/(X_i + X_j)$ may be less variable than $Y_i'/X_i$. This reduction is naturally 
more pronounced where the between p.s.u. contribution is high, i.e., 
with number of days worked off farms for which the reduction in vari- 
ance due to the method of estimation more than compensates for the 
increase in variance due to the inclusion of some between strata varia- 
tion. 

The between components of the total error for systems F and G, 
where selection is made without replacement, are highly efficient com- 
pared to the same components for systems H and K where selection is 
made with replacement. 

With a two per cent sample, it is estimated that system G would 
provide an estimate of each characteristic total within a ten per cent 
C.V. and within a seven per cent C.V. for all characteristics excepting 
number of days worked off farms. For a specific purpose survey, sys- 
tem F should be used. 
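
The statement about system G can be checked directly against Table 5. The sketch below converts the tabulated total (C.V.)² × 10⁴ values for system G at the two per cent rate into per cent C.V.; the values are copied from Table 5, while the dictionary and function names are illustrative assumptions.

```python
import math

# Total (C.V.)^2 x 10^4 for system G at a 2 per cent sampling rate (Table 5).
SYSTEM_G_2_PER_CENT = {
    "No. of Non-White Operators": 42,
    "Value of Land and Bldgs.": 20,
    "No. of Days Worked off Farms": 93,
    "Total No. of Farms": 11,
}


def cv_per_cent(cv_squared_times_1e4):
    """Convert a tabulated (C.V.)^2 x 10^4 entry into a per cent C.V."""
    return 100.0 * math.sqrt(cv_squared_times_1e4 / 1e4)


CV_PER_CENT = {name: cv_per_cent(value)
               for name, value in SYSTEM_G_2_PER_CENT.items()}

for name, cv in CV_PER_CENT.items():
    print(f"{name}: {cv:.1f} per cent C.V.")
```

Every entry falls below ten per cent, and only number of days worked off farms exceeds seven per cent, in agreement with the text.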


6. CONCLUSIONS 


Of the four characteristics investigated in this paper, one was in- 
fluenced greatly by war-time activities between 1940 and 1945. None of 
the sampling systems studied was suitable to estimate this item, num- 
ber of days worked off farms, with seven per cent accuracy when a 
sampling rate of two per cent is used. If a stratified sample using the 
98 counties as strata is substituted for a two-stage sampling system, a 
1.25 per cent sampling rate would be sufficient to estimate this item 
with a five per cent C.V. The conclusions which follow relate to the 
other three characteristics only, namely number of non-white operators, 
value of land and buildings and total number of farms. 

For a general purpose survey to estimate characteristic totals, sys- 
tem C is recommended. This system will provide estimates for all the 
types of stratification considered with no more than a nine per cent 
C.V. if a two per cent sampling rate is used. 

Among the four types of stratification and p.s.u. considered in this 
study, two of them, namely, 40 strata using the county or 197 strata 
using the m.c.d. as the p.s.u., were found to be most suitable from the 
point of view of total error and are recommended for estimating the 
characteristic totals. With a two per cent sampling rate using system C 
and either of these types of stratification, estimates of the characteristic 
totals could be obtained within a C.V. of six per cent. 

The within p.s.u. component of error is not reduced to the same ex- 
tent as the between p.s.u. component is increased, when the m.c.d. is 
used as the p.s.u. instead of the county with 40 strata. Even with the 
m.c.d. as the p.s.u., the within contribution to the total error is a very 
important factor for any reasonably practicable sample size. 

The efficiencies of p.p.s. and equal probability selection are compared 
in systems A and B. The method of estimation, ratio to the value of the 
characteristic in 1940, is the same in both systems. For all cases investi- 
gated, system A (p.p.s. selection of p.s.u.’s with number of farms in 
1940 as the measure of size) was found to be superior to system B (selec- 
tion of p.s.u.’s with equal probability). 

Although system D, in which selection is p.p.s. to the 1940 value of the 
characteristic, is impracticable when it is desired to estimate the totals 
of a number of characteristics from a single sample, it is generally suit- 
able for specific purpose surveys, where one is interested in only a single 
characteristic. However, system C was found to be nearly as efficient 
as system D in estimating totals for the characteristics studied in this 
paper. This may not be true in general. 

The choice of a sampling system has been based on the assumption 
of equal costs per schedule. There is a real need for using a more real- 
istic cost function, but this would necessitate acquiring adequate and 
accurate information on the various cost factors, e.g., cost of travel 
and cost of enumeration in survey designs. 

The principal contribution in this paper is the examination of a 
sampling design for the selection from a stratum of two p.s.u.’s, the 
first with p.p.s. and the second with equal probability from the re- 
maining p.s.u.’s. The advantage of this design over one involving selec- 
tion of a single p.s.u. from each stratum is that it permits an unbiased 
estimate [11] of the sampling error from sample data. In some cases, in 
addition, the system involving the selection of two p.s.u.’s per stratum 
may be more efficient than systems permitting the selection of only 
one p.s.u. from each of twice as many strata. It appears that this will 
be true when there is extreme variability among the p.s.u.’s, such as in 
the case of number of days worked off farms. The between p.s.u. vari- 
ance for the three remaining items studied is much lower, but even in 
these cases the efficiencies of the systems selecting one p.s.u. per stra- 
tum do not greatly exceed those in which two p.s.u.’s per stratum are 
selected without replacement. 

A considerable part of the material in this paper is taken from the 
senior author’s thesis submitted in partial fulfilment of the require- 
ments of the Ph.D. degree in Experimental Statistics, North Carolina 
State College, 1952. Correspondence with E. H. Jebe of Iowa State 
College and Morris Hansen of the Bureau of Census helped in the gen- 
eral orientation and understanding of certain aspects of the problem. 
The computational work involved in this study was enormous and 
would have remained incomplete but for the special operations devised 
by members of the I.B.M. staff. The authors wish to express their ap- 
preciation to D. G. Horvitz for offering many valuable suggestions in 
the final preparation of this paper. 


APPENDIX 


The expressions given below are the mean square errors for the esti- 
mates for each of the nine sampling systems presented in this paper 
(Section 2 and Section 4). As before A, B, C, D, and E stand for the 
systems in which one p.s.u. is selected from a stratum and F, G, H, 
and K for the systems in which two p.s.u.’s are selected from a stratum. 
For simplicity of notation only the results for a single stratum are 
presented. These results when summed over all strata will provide the 
mean square error of the estimate for the entire state, noting that the 
bias is first summed over all strata and this sum is squared. 
where $\sigma_i^2$ is defined as for $V_A$. 
Sampling system C: 
where $\sigma_i^2$ is defined as for $V_A$. 
Sampling system D: 
where $Z_i = M_i(M_i - m_i)\sigma_i^2/m_i$ and $\sigma_i^2$ is defined as for $V_A$. 
Sampling system G: 
Sampling system H: 
Sampling system K: 


COMBINING INDEPENDENT TESTS OF SIGNIFICANCE* 


ALLAN BIRNBAUM 
Columbia University 


It is shown that no single method of combining independent 
tests of significance is optimal in general, and hence that the 
kinds of tests to be combined should be considered in selecting 
a method of combination. A number of proposed methods of 
combination are applied to a particular common testing prob- 
lem. It is shown that for such problems Fisher’s method and 
a method proposed by Tippett have an optimal property. 


1. THE PROBLEM AND SOME PROPOSED SOLUTIONS 


THE problem of combining independent tests of significance has been 
discussed and illustrated by a number of writers, including Fisher 
[2], Karl Pearson (cf. [4]), Wallis [7], and E. S. Pearson [4], to which the 
reader is referred for general discussions to supplement the present 
brief section. The formal statistical problem may be stated as follows: 
A hypothesis $H_0$ is to be tested. An observed value $t_1$ of a statistic 
has been obtained; the best test of $H_0$ based on this statistic would 
indicate rejection at the $u_1$ significance level. That is, $u_1$ is the 
“probability level” corresponding to the observed value $t_1$; for example, if 
large values of the statistic are critical for $H_0$, then $u_1$ is the probability 
that a value as large as or larger than that observed will occur under 
$H_0$. Similarly, independent values of statistics, $t_2, \ldots, t_k$, have been 
obtained, and in the respective best tests of $H_0$ based on these statistics 
the corresponding “probability levels” are $u_2, \ldots, u_k$. The essential 
requirement of independence of the $t_i$’s will be satisfied if each $t_i$ is 
based on a separate and independent set of data; if each $t_i$ is based on 
the same set of observations, the $t_i$’s must be known to be statistically 
independent functions of the observations. 

The problem of “combining these independent tests of significance” 
then is the problem of giving a test of $H_0$ on the basis of a set of 
observed values (probability levels) $u_1, u_2, \ldots, u_k$. The test is not to 
utilize the observed values $t_1, t_2, \ldots, t_k$; in general, it is assumed that 

(a) either these values or else the forms of the distributions of 
$t_1, t_2, \ldots, t_k$ are unknown to the statistician confronted with the 
present problem; or 





* Work sponsored by the Office of Naval Research. 
The writer is grateful to Professor Henry Scheffé for helpful comments on the first draft of this 
paper. Responsibility for any remaining deficiencies is the writer's. 

(b) this information is available but the distributions are such that 
there is no known or reasonably convenient method available for 
constructing a single appropriate test of $H_0$ based on $(t_1, t_2, \ldots, t_k)$. 

Procedures for combining independent significance tests may be of 
practical use even in some situations in which the statistician has com- 
plete freedom to determine the design of a complex experiment. Sup- 
pose, for example, that a scientific hypothesis asserts that a change in 
the value of an independent experimental variable will alter the 
distribution of one or more of $k$ observable variables $t_1, t_2, \ldots, t_k$. For 
example, an hypothesis to be tested may assert that administration to 
subjects of a certain drug will have at least one of the three effects: 

(a) an increase in the mean of a certain measurable physiological 
quantity, 

(b) an increase in the variance (within a subject) of a second meas- 
urable physiological quantity, and 

(c) a decrease in the probability of a subject’s correctly making a 
certain sensory discrimination. 

Suppose that optimal tests for each of these effects separately could 
be based respectively on statistics $t_1$, $t_2$, and $t_3$. In such situations 
construction of a single optimal test for the presence of one or more of the 
effects may be difficult or impossible. However, combining statistically 
independent tests based on $t_1$, $t_2$, and $t_3$, a single test at a desired 
significance level can be given. With appropriate design, this test will also 
meet given requirements of power to detect one or more of the three 
effects. It is shown in Section 3 that for some problems such a test will 
even have certain efficiency properties. 

To avoid technical complications not of direct interest here, let us 
assume that the $t_i$’s have continuous distributions (densities). (See [7] 
for a discussion of the important discrete case.) Since $u_i$ is the 
probability when $H_0$ is true of observing a value of our $i$th statistic at least 
as large as $t_i$, we may write 

$$u_i = u_i(t_i) = \int_{t_i}^{\infty} p_i(t_i)\,dt_i \tag{1}$$

where $p_i(t_i)$ is the probability density function of $t_i$ under $H_0$. Then 
the probability that $u_i$ lies in any interval, say $u' \le u_i \le u''$, equals 
$u'' - u'$, or in other words $u_i$ has a uniform distribution on the unit 
interval under $H_0$ with density 

$$f(u_i) = \begin{cases} 1, & 0 \le u_i \le 1, \\ 0, & u_i < 0,\ u_i > 1, \end{cases} \tag{2}$$

for each $i$, and the $u_i$’s are mutually independent. 

Each method of combining tests is a rule prescribing that $H_0$ should 
be rejected whenever the set of values $(u_1, \ldots, u_k)$ falls in a certain 
critical region. Intuitively speaking, small values of the $u_i$’s are 
indicative of rejection; to discuss satisfactorily the problem of constructing 
a critical region of values of $(u_1, \ldots, u_k)$, we must consider the possible 
distributions of the $u_i$’s when $H_0$ is false. We shall assume here that 
whenever a $u_i$ has a non-uniform distribution, it is distributed on the 
unit interval according to some (unknown) density function $g_i(u_i)$ 
which is non-increasing.¹ 

Depending on the nature of the experimental situations in which the 
$u_i$’s are obtained, the appropriate alternative hypothesis would be 
either: 

$H_A$: All of the $u_i$’s have the same (unknown) non-uniform, 
non-increasing density $g(u)$. 
or: 
$H_B$: One or more of the $u_i$’s have (unknown) non-uniform densities 
$g_i(u_i)$. 

Under $H_A$, the $t_i$’s are statistics of the same kind obtained from $k$ 
replications of an experiment, in which the underlying conditions are 
assumed to remain constant with $H_0$ false. Under $H_B$, the $t_i$’s may be 
statistics of different kinds (for example, a normal mean and a normal 
variance), and the conditions under which the $t_i$’s are obtained need 
not be the same; it is assumed only that $H_0$ is false in the case of at 
least one of the $t_i$’s. $H_A$ is seen to be a special case of $H_B$. Probably in 
the majority of applications, $H_B$ is the appropriate alternative 
hypothesis.² 





1 This assumption is not a strong one for our purposes: Suppose large values of the statistic $t$ are 
critical for testing $H_0$ against $H_1$, and the probability densities of $t$ under $H_0$ and $H_1$ are $p(t)$ and $p'(t)$, 
respectively. Then the definition of the statistic $u$ is 

$$u = u(t) = \int_t^{\infty} p(t)\,dt, \tag{3}$$

so $du/dt = -p(t)$. If the probability density of $t$ is $p'(t)$ then that of $u$ is 

$$g(u) = p'(t)/|du/dt| = p'(t)/p(t). \tag{4}$$

Hence $g(u)$ will be a non-increasing function of $u$ if and only if $p'(t)/p(t)$ is a non-decreasing function of 
$t$. The latter condition is satisfied for most distributions commonly encountered in applied statistics, 
including those of normal, binomial, and Poisson means, normal variances, and all other distributions of 
the Koopman form described in Section 3 below. 

2 In some papers cited above, the distinction between the alternatives $H_A$ and $H_B$ seems not to 
have been made sufficiently clear. Problems corresponding to $H_A$ are considered by Wallis on p. 238 
of [7] and by Pearson on p. 142 of [4]. Problems corresponding to $H_B$ are considered by Wallis on pp. 
245-56 of [7] and by Pearson on p. 138 of [4]. 

Some of the methods which have been proposed for combining 
independent tests of significance (i.e., for constructing critical regions of 
values of $(u_1, u_2, \ldots, u_k)$) are the following: 

(1) Fisher’s [2] method: reject $H_0$ if and only if $u_1 u_2 \cdots u_k \le c$, 
where $c$ is a predetermined constant corresponding to the desired 
significance level. Wallis, on pp. 231-34 of [7], discusses in detail Fisher’s 
method of appropriately determining $c$. It turns out that 
$-2 \log (u_1 u_2 \cdots u_k)$ is distributed as chi-square with $2k$ degrees of freedom when 
$H_0$ is true. If $d$ is such that 

$$\operatorname{Prob}\{\chi^2_{2k} \ge d\} = \alpha \tag{5}$$

where $\alpha$ is the desired significance level, then setting $-2 \log c = d$, 
we obtain $c = e^{-d/2}$. 

(2) Karl Pearson’s method: reject $H_0$ if and only if 
$(1-u_1)(1-u_2)\cdots(1-u_k) \ge c$, where $c$ is a predetermined constant corresponding 
to the desired significance level. In applications, $c$ can be computed by 
a direct adaptation of the method used to calculate the $c$ used in 
Fisher’s method. 

(3) Wilkinson’s [8] methods: reject $H_0$ if and only if $u_i \le c$ for $r$ or 
more of the $u_i$’s, where $r$ is a predetermined integer, $1 \le r \le k$, and $c$ is 
a predetermined constant corresponding to the desired significance 
level. The $k$ possible choices of $r$ give $k$ different procedures which we 
shall refer to as case 1 ($r=1$), case 2 ($r=2$), etc. For example, if $k=2$ 
and a test at the .05 significance level is desired, the case 1 procedure 
is: reject $H_0$ if either $u_1$ or $u_2$ or both are less than or equal to 
$c = 1-(.95)^{1/2} = .025$ (equivalently, if either $1-u_1$ or $1-u_2$ equals or 
exceeds $(.95)^{1/2} = .975$); the case 2 procedure is: reject $H_0$ if both $u_1$ 
and $u_2$ are less than or equal to $c = (.05)^{1/2} = .224$ (equivalently, if both 
$1-u_1$ and $1-u_2$ equal or exceed $1-(.05)^{1/2} = .776$). Case 1 was 
proposed earlier by Tippett [5]. 
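
The constants in Fisher's and Wilkinson's rules can be computed numerically. The sketch below is one way to do so under the convention that small probability levels are critical and that the test size is a small number such as .05; the function names are illustrative assumptions, and the closed-form chi-square tail used here holds only for even degrees of freedom, which is the case that arises in Fisher's method.

```python
import math


def chi2_upper_tail_even(d, k):
    """P(chi-square with 2k degrees of freedom >= d), exact for even d.f."""
    half = d / 2.0
    term = total = 1.0
    for j in range(1, k):
        term *= half / j
        total += term
    return math.exp(-half) * total


def solve_increasing(func, target, lo, hi, iterations=200):
    """Bisection for func(x) = target when func is increasing on [lo, hi]."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def fisher_level(us):
    """Combined probability level of Fisher's method for levels u_1..u_k."""
    d = -2.0 * sum(math.log(u) for u in us)
    return chi2_upper_tail_even(d, len(us))


def fisher_constant(alpha, k):
    """Constant c such that rejecting when u_1...u_k <= c has size alpha."""
    d = solve_increasing(lambda v: 1.0 - chi2_upper_tail_even(v, k),
                         1.0 - alpha, 0.0, 1000.0)
    return math.exp(-d / 2.0)


def wilkinson_constant(alpha, k, r):
    """Constant c such that 'u_i <= c for r or more i' has size alpha."""
    def size(c):
        # Probability that at least r of k independent uniforms are <= c.
        return sum(math.comb(k, j) * c ** j * (1.0 - c) ** (k - j)
                   for j in range(r, k + 1))
    return solve_increasing(size, alpha, 0.0, 1.0)


print(fisher_constant(0.05, 2))
print(wilkinson_constant(0.05, 2, 1), wilkinson_constant(0.05, 2, 2))
```

For $k=2$ at the .05 level the Wilkinson constants agree with $1-(.95)^{1/2}$ for case 1 and $(.05)^{1/2}$ for case 2, so that their complements are the values $(.95)^{1/2}$ and $1-(.05)^{1/2}$.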

In the following sections, certain bases for selecting methods of 
combination for particular problems will be developed. 


2. A GENERAL CONDITION FOR ADMISSIBILITY OF 
METHODS OF COMBINATION 


The following condition is readily seen to be satisfied by each of the 
proposed methods described above: 

Condition 1: If $H_0$ is rejected for any given set of $u_i$’s, then it will 
also be rejected for all sets of $u_i^*$’s such that $u_i^* \le u_i$ for each $i$. 

Any method of combination which failed to satisfy this condition 
would seem unreasonable. In fact, it is not difficult to prove that the 
best test of $H_0$ against any particular alternative $H_B$ of the kind 
described above satisfies Condition 1. (A proof is given in the Appendix.) 

Since Condition 1 is satisfied by so many possible methods of 
combination, the question arises whether any further reasonable condition 
can be imposed to narrow still further the class of methods from which 
we must choose. The answer is no: So long as we consider the problem 
in the present generality, and not with reference to a particular kind 
of testing problem, there are no restrictions on the possible forms of the 
density functions $g_i(u_i)$ except that they be non-increasing. And it can 
be shown (see the Appendix) that for each method of combination 
satisfying Condition 1, we can find some alternative $H_B$ represented by 
non-increasing functions $g_1(u_1), \ldots, g_k(u_k)$ against which that method 
of combination gives a best test of $H_0$. 

These considerations prove that to find useful bases for choosing 
methods of combination, we must consider further the particular 
kinds of tests to be combined in any given problem. In the following 
sections, it is shown that certain methods are optimal for certain im- 
portant categories of testing problems. 


3. DISTRIBUTIONS OF THE KOOPMAN FORM 


Nearly all of the density functions and discrete probability distribu- 
tion functions encountered frequently in applied statistics can be 
written in the so-called Koopman form, which is 


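$$f(x, \theta) = c(\theta)\,a(\theta)^{b(x)}\,t(x) \tag{6}$$

where $\theta$ is a parameter of the distribution and $x$ is an observed value, 
and $a$, $b$, $c$, and $t$ denote arbitrary functions. Examples are 
1. The binomial: 

$$f(x, \theta) = \binom{n}{x}\theta^x(1-\theta)^{n-x} = (1-\theta)^n\left(\frac{\theta}{1-\theta}\right)^{x}\binom{n}{x} \tag{7}$$

2. The normal, with known (say unit) variance and mean $\theta$: 

$$f(x, \theta) = \frac{1}{\sqrt{2\pi}}\,e^{-(x-\theta)^2/2} = \frac{1}{\sqrt{2\pi}}\,e^{-\theta^2/2}\,(e^{\theta})^{x}\,e^{-x^2/2}. \tag{8}$$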

Other examples are the Poisson and exponential distributions and 
the normal distribution with known mean and unknown variance. 
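
The two factorizations can be verified numerically. The sketch below assumes the Koopman form is read as the product of a factor depending on the parameter only, a parameter-dependent base raised to a power depending on the observation, and a factor depending on the observation only; the argument names `c`, `a`, `b`, `t` and the function names are illustrative choices, not notation confirmed by the paper.

```python
import math
from statistics import NormalDist


def koopman(c, a, b, t):
    """Value c * a**b * t of a density written in the assumed Koopman form."""
    return c * a ** b * t


def binomial_koopman(x, theta, n):
    """Binomial probability of x successes in n trials, in Koopman form."""
    return koopman(c=(1.0 - theta) ** n, a=theta / (1.0 - theta),
                   b=x, t=math.comb(n, x))


def normal_koopman(x, theta):
    """Unit-variance normal density with mean theta, in Koopman form."""
    return koopman(c=math.exp(-theta ** 2 / 2.0) / math.sqrt(2.0 * math.pi),
                   a=math.exp(theta), b=x, t=math.exp(-x ** 2 / 2.0))


# Compare with the usual expressions at a few arbitrary points.
print(binomial_koopman(3, 0.4, 10), math.comb(10, 3) * 0.4 ** 3 * 0.6 ** 7)
print(normal_koopman(1.2, 0.5), NormalDist(mu=0.5, sigma=1.0).pdf(1.2))
```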

Consider a problem of combining $k$ independent significance tests, 
each of which is a test on a distribution of the Koopman form (all $k$ 
distributions need not be of the same sort however; for example, one 
might be on a normal mean and another on a binomial mean, as in the 
illustration in Section 1 above). A method of combining these tests 
will be equivalent to a test of a hypothesis specifying the values of $k$ 
parameters, 

$$H_0:\ \theta_1 = \theta_1^0,\ \ldots,\ \theta_k = \theta_k^0,$$

on the basis of the observed values of the statistics $t_1, \ldots, t_k$. For 
such problems, a minimal criterion for the reasonableness of a test is 
known. “Reasonableness” is used here in the sense of admissibility of 
a test, which may be defined as follows: A test is admissible if there is 
no other test with the same significance level which, without ever being 
less sensitive to possible alternative hypotheses, is more sensitive to 
at least one alternative. In other words, an admissible test is one which 
cannot be strictly (that is, uniformly) improved upon. A necessary 
condition for admissibility of a test of $H_0$ in our problem is that the 
acceptance region of the test (that is, the values of $(t_1, \ldots, t_k)$ for 
which the test accepts $H_0$) be convex. (A region is convex if the line 
segment connecting each pair of points in the region lies entirely in the 
region.) This is shown in [1]. 

We may illustrate both this condition for admissibility and its 
application to methods of combination by considering the problem of 
combining two tests on means of normal distributions with known (say 
unit) variances. (The performance of Fisher’s method when applied 
to such a problem has been considered by Wallis (pp. 237-39 of [7]) 
and by Pearson (p. 142 of [4]).) Let $\bar{x}_1$ denote the mean of a sample of 
$n_1$ observations obtained in an experiment in which the underlying 
population mean had the unknown value $\mu_1$; let $\bar{x}_2$ be the mean of a 
sample of $n_2$ observations in a similar experiment in which the unknown 
population mean was $\mu_2$. In this case any method of combining tests of 
the two hypotheses $\mu_1 = 0$ and $\mu_2 = 0$ is equivalent to a test of $H_0$: 
$\mu_1 = \mu_2 = 0$; then $H_A$ would specify $\mu_1 = \mu_2 \ne 0$; and $H_B$ would specify 
that either $\mu_1$ or $\mu_2$ or both are not zero. Let $t_1 = \sqrt{n_1}\,\bar{x}_1$ and $t_2 = \sqrt{n_2}\,\bar{x}_2$. 
Then any method of combining the tests on $\mu_1$ and $\mu_2$ can be represented 
as a test of $H_0$ by its critical region in the $(t_1, t_2)$ plane. Each of the 
methods of combination described above has been applied to the 
present problem, and the critical region corresponding to each method 
is illustrated in the figures below. The significance level $\alpha = .05$ was 
used throughout. The tests on $\mu_1$ and $\mu_2$ to be combined were taken 
first to be against two-sided alternatives (Figures 1-4) and then against 
one-sided alternatives (Figures 6-9). In each case the critical region 
was obtained by first determining the values of $u_1$ and $u_2$ for which the 
method of combination considered would reject $H_0$ at the .05 
significance level, and then plotting the corresponding values of $t_1$ and $t_2$ by 
use of the equations relating the $t_i$’s to the $u_i$’s. These equations are, 
for the two-sided tests, 


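$$u_i = \frac{2}{\sqrt{2\pi}}\int_{|t_i|}^{\infty} e^{-y^2/2}\,dy, \quad \text{for } i = 1, 2, \tag{9}$$

and for the one-sided tests, 

$$u_i = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{t_i} e^{-y^2/2}\,dy, \quad \text{for } i = 1, 2. \tag{10}$$

Fig. 1. Wilkinson’s Method, Case 1. 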


We can now apply the condition for admissibility of a test described 
above: The acceptance regions obtained by Wilkinson’s method, case 
2, and by Pearson’s method are not convex. Hence they represent tests 
of $H_0$, and corresponding methods of combination for the present 
problem, which can be strictly improved upon by other tests and 
corresponding methods of combination. Present knowledge does not provide 
methods of finding tests which actually do strictly improve upon a 
given inadmissible test in problems like the present one. However, it 
seems advisable in selecting tests to restrict consideration to the class 
of admissible tests, and to select from this class a test which seems to 
have relatively good sensitivity (power) against the range of alterna- 
tives of interest. 
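
The convexity claims for the two-sided normal case can be explored numerically. The following sketch, for $k = 2$ at the .05 level, tests whether the midpoint of two accepted points of the $(t_1, t_2)$ plane is itself accepted. It assumes the two-sided probability level of a unit-variance normal statistic, and it uses the fact that the product of two independent uniform variables is at most $c$ with probability $c(1 - \log c)$; the grid and all function names are illustrative assumptions. A search of this kind can exhibit non-convexity but cannot by itself prove convexity.

```python
import math
from itertools import combinations


def two_sided_level(t):
    """Two-sided probability level of a unit-variance normal statistic t."""
    return math.erfc(abs(t) / math.sqrt(2.0))


def product_constant(target):
    """Solve c * (1 - log c) = target, the chance that the product of two
    independent uniform variables is at most c."""
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid * (1.0 - math.log(mid)) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


ALPHA = 0.05
C_FISHER = product_constant(ALPHA)          # reject when u1 * u2 <= c
C_PEARSON = product_constant(1.0 - ALPHA)   # reject when (1-u1)(1-u2) >= c


def fisher_accepts(t1, t2):
    return two_sided_level(t1) * two_sided_level(t2) > C_FISHER


def pearson_accepts(t1, t2):
    return (1.0 - two_sided_level(t1)) * (1.0 - two_sided_level(t2)) < C_PEARSON


def midpoint_violations(accepts, points):
    """Count accepted pairs of points whose midpoint is rejected."""
    inside = [p for p in points if accepts(*p)]
    return sum(1 for p, q in combinations(inside, 2)
               if not accepts((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0))


# Square grid over [-4, 4] x [-4, 4] in steps of 0.5.
GRID = [(-4.0 + 0.5 * i, -4.0 + 0.5 * j) for i in range(17) for j in range(17)]

print("Fisher violations: ", midpoint_violations(fisher_accepts, GRID))
print("Pearson violations:", midpoint_violations(pearson_accepts, GRID))
```

For Pearson's rule a single pair already suffices: a point far out on either axis is accepted because one factor $1 - u_i$ vanishes, while the midpoint of two such points on different axes has both factors near one and is rejected.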


Fig. 2. Wilkinson’s Method, Case 2. 


It is shown in [1] that, for a category of problems including the 
present one, convexity of the acceptance region is a sufficient as well 
as necessary condition for admissibility. 

The remaining two methods of combination, Wilkinson’s case 1 
and Fisher’s, correspond to admissible tests. Inspection of Figures 1 and 
3 suggests that each is fairly sensitive to departures from $H_0$ in all 
directions; that Fisher’s method comes close to that test of $H_0$ 
(represented in Figure 5) which, if $n_1 = n_2$ and if the seriousness of a 
departure from $H_0$ is measured by $\mu_1^2 + \mu_2^2$, is the best test at the .05 level 
(as Wallis noted in [7]); and finally that Wilkinson’s method, case 1, 
gives a relative concentration of sensitivity to alternatives in which 
the departure from $H_0$ occurs in just one of the parameters. Similar 
observations can be made in Figures 6 and 8. Hence, it seems 
warranted to make a choice between the two methods remaining under 
consideration on the basis of a subjective appraisal of the context in 
which a problem like the present one actually occurs; probably in 
most cases Fisher’s method would be preferred. 








Fig. 3. Fisher’s Method. 


Having considered in detail a problem involving a particular distri- 
bution of the Koopman form, we proceed now to show that similar 
considerations apply to the whole class of such distributions. It can be 
verified easily that if Wilkinson’s methods are used to combine tests on 
any such distributions, the result corresponds to a test whose accept- 
ance region has a rectangular boundary like those in Figures 1, 2, 6, 
and 7, and is convex only in case 1. Hence, only case 1 of Wilkinson’s 
method corresponds to an admissible test for certain of the Koopman-
form distributions being considered. The remaining cases of the method
correspond to inadmissible tests for all Koopman-form distributions.
With little more difficulty it can be verified that Pearson’s method
does not give a test of H0 with convex acceptance region for any
Koopman-form distributions (consider the three points in the plane of
the two test statistics corresponding to $(u_1, u_2) = (1 - c, 0)$, to
$(u_1, u_2) = (0, 1 - c)$, and to $(u_1, u_2) = (1 - \sqrt{c}, 1 - \sqrt{c})$,
for the case $k = 2$). Thus, Pearson’s method
also may be removed from consideration as inadmissible for Koopman-
form distributions.

Fig. 4. Pearson’s Method.

Fisher’s method does seem to give tests of H0 with
convex acceptance regions for Koopman-form distributions; considera-
tion of the points in the same plane corresponding to $(u_1, u_2) = (1, c)$,
to $(u_1, u_2) = (c, 1)$, and to $(u_1, u_2) = (\sqrt{c}, \sqrt{c})$ suggests this, and for par-
ticular distributions it may be possible to verify it fully without too
much difficulty. For example, for

(11)  $f(x, \theta) = \dfrac{1}{\theta} e^{-x/\theta}, \quad x > 0,$

to combine two one-sided tests on $\theta$ based on one observation each, we
have

(12)  $u_i = e^{-x_i/\theta}, \quad \text{for } i = 1, 2,$

and the critical region $u_1 u_2 \le c$ corresponds to a test with the convex
acceptance region $x_1 + x_2 \le -\theta \log c$.
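The equivalence stated for (11) and (12) can be checked numerically. The following is a minimal Python sketch, not part of the original paper; the function names and the choice of test values are illustrative assumptions, and `theta0` stands for the value of the parameter specified by the hypothesis tested.

```python
import math

def significance_level(x, theta0):
    """One-sided level u = P(X >= x) under the density (11) with parameter theta0."""
    return math.exp(-x / theta0)

def fisher_rejects(x1, x2, theta0, c):
    """Fisher's combination: the critical region is u1 * u2 <= c."""
    return significance_level(x1, theta0) * significance_level(x2, theta0) <= c

def sum_rejects(x1, x2, theta0, c):
    """Same region in the sample plane: x1 + x2 >= -theta0 * log c."""
    return x1 + x2 >= -theta0 * math.log(c)

def fisher_level(c):
    """P(u1 * u2 <= c) for two independent uniform levels: c * (1 - ln c)."""
    return c * (1.0 - math.log(c))
```

The acceptance region is the part of the positive quadrant below the line $x_1 + x_2 = -\theta \log c$, which is convex; this is the property the text relies on.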


Fig. 5. Best Symmetric Test Against HA.


4. CONCLUSIONS 


While there is no single method of combining tests which is best for 
all problems, it appears that to combine independent tests on Koop- 
man-form distributions (these include most distributions commonly 
occurring in applied statistics) one should choose between Fisher’s
method and Wilkinson’s method, case 1. Fisher’s method appears to
have somewhat more uniform sensitivity to the alternatives of interest 
in most problems. For any particular distributions, investigations may 
be made paralleling those above to obtain a still more conclusive basis 
for choice of a method of combination.

Fig. 6. Wilkinson’s Method, Case 1. One-sided Alternatives.


APPENDIX 


To prove that, as stated in Section 2 above, every test of Ho which 
is best against some particular alternative specifying non-increasing 
densities, satisfies Condition 1, we use the well-known fact, proved for 
example in [3], that any best critical region consists of points satisfying 


(13)  $\lambda = \dfrac{g_1(u_1) \cdots g_k(u_k)}{f_1(u_1) \cdots f_k(u_k)} \ge c,$  c some constant.

Now $f_i(u_i) = 1$ for $0 \le u_i \le 1$, $i = 1, \dots, k$. Hence, $\lambda = g_1(u_1) \cdots g_k(u_k)$.
As the $g_i(u_i)$'s are non-increasing, $g_1(u_1') \cdots g_k(u_k') \ge g_1(u_1) \cdots g_k(u_k) \ge c$
if $(u_1, \dots, u_k)$ is in the best critical region and if $u_i' \le u_i$ for
$i = 1, \dots, k$. Thus Condition 1 is satisfied.

Fig. 7. Wilkinson’s Method, Case 2. One-sided Alternatives.

However, in general HB (and even HA) will include a whole set of
possible forms of the $g_i(u_i)$'s, and it is not true in general that there will
exist a single test of H0 which is uniformly best against all possibilities.
This is illustrated most simply in the following case under HB: If
$g_1(u_1)$ is uniform and $g_2(u_2)$ is nonuniform, a best critical region consists
of all $(u_1, u_2)$ such that $u_2 \le c$, c some constant; if $g_2(u_2)$ is uniform and
$g_1(u_1)$ nonuniform, a best critical region consists of all $(u_1, u_2)$ such that
$u_1 \le c'$, c' some constant; thus, there is not a single critical region which
is best against each alternative. It can be verified directly that every
best test of H0 against a “Bayes mixture” of simple alternatives under





572 AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEMBER 1984 


HB also satisfies Condition 1. It follows, as shown by Wald in [6],
that under general assumptions Condition 1 is a necessary condition
for admissibility of a test of H0 against a composite alternative HB.
We shall show next that, as stated in Section 2 above, each method
of combination meeting Condition 1 is best against some particular
alternative hypothesis HB. Taking k=2 for simplicity, any critical
region w of values of $(u_1, u_2)$, if it satisfies Condition 1, can be charac-
terized by giving its boundary function $u_2(u_1)$, a non-increasing func-
tion such that w consists of all points $(u_1, u_2)$ in the unit square
$0 \le u_1 \le 1$, $0 \le u_2 \le 1$ for which $u_2 < u_2(u_1)$. Let $u_2(u_1)$ be any such bound-
ary function. Let $g_2(u_2) = \tfrac{2}{3}(2 - u_2)$ for $0 \le u_2 \le 1$, and let
$g_1(u_1) = \tfrac{3}{2} c\,(2 - u_2(u_1))^{-1}$ for $0 \le u_1 \le 1$, where c is determined by the condition
that $\int_0^1 g_1(u_1)\,du_1 = 1$. A best critical region for testing H0 against the
alternative $g_1(u_1)$, $g_2(u_2)$ is the set w' on which $g_1(u_1) g_2(u_2) > c$. But
$g_1(u_1) g_2(u_2) = c\,(2 - u_2)/(2 - u_2(u_1)) > c$ if and only if $u_2 < u_2(u_1)$. Thus
the arbitrarily given boundary function $u_2(u_1)$ characterizes a best
critical region w'.

Fig. 8. Fisher’s Method. One-sided Alternatives.
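The construction of the alternative densities can be illustrated with a short Python sketch. It is an illustration under stated assumptions rather than part of the proof: the boundary function is taken to be that of Fisher's method with an arbitrary constant `c0`, the integral is approximated by the midpoint rule, and all names are hypothetical.

```python
def fisher_boundary(u1, c0=0.05):
    """Assumed example boundary u2(u1): Fisher's region u1*u2 <= c0, clipped to the unit square."""
    return min(1.0, c0 / u1) if u1 > 0.0 else 1.0

def normalising_constant(boundary, steps=20000):
    """Constant c for which g1 integrates to one on [0, 1] (midpoint rule)."""
    h = 1.0 / steps
    integral = sum(1.5 / (2.0 - boundary((i + 0.5) * h)) for i in range(steps)) * h
    return 1.0 / integral

def g1(u1, boundary, c):
    """Alternative density of u1: (3/2) c / (2 - u2(u1))."""
    return 1.5 * c / (2.0 - boundary(u1))

def g2(u2):
    """Alternative density of u2: (2/3)(2 - u2)."""
    return (2.0 / 3.0) * (2.0 - u2)

def in_best_region(u1, u2, boundary, c):
    """Likelihood-ratio region g1 * g2 > c."""
    return g1(u1, boundary, c) * g2(u2) > c
```

For any point of the unit square off the boundary curve, `in_best_region` should agree with the condition $u_2 < u_2(u_1)$, which is the statement proved above.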

(Similar methods give analogous results for the problem of testing
H0 against HA, with Condition 1 now strengthened by the requirement
that the boundary function $u_2(u_1)$ be symmetric about the line $u_1 = u_2$.)

Fig. 9. Pearson’s Method. One-sided Alternatives.


MINIMUM LIFE IN FATIGUE


A. M. FREUDENTHAL AND E. J. GUMBEL
Columbia University


IN CONVENTIONAL fatigue testing a specimen is repeatedly stressed in
bending, torsion or tension-compression during imposed force-cycles
of constant amplitude. The number N of cycles at which the specimen 
breaks, which is a function of the applied stress-amplitude S, is the 
observed variate [1]. The problem is to obtain, for specimens of speci- 
fied material, shape and size, the probability of surviving up to N 
repetitions of a specified stress-cycle in a given testing procedure. 

In a previous paper [2] fatigue was interpreted as an extremal phe- 
nomenon. A statistical scheme for the analysis of fatigue failures was 
developed and applied to the few observed series which satisfy the 
criteria for the applicability of any statistical procedure. The prob- 
ability of surviving up to N cycles was obtained from the asymptotic 
theory of smallest values of a non-negative statistical variate. In this 
derivation it was assumed that a non-zero probability of failure existed 
even at the first cycle. This led to a good fit of the theory to tests made 
on copper and on aluminum at certain stress levels. However, it did 
not fit well enough the tests at other (lower) stress levels or of other 
metals. It appeared that such metals under certain stress levels would 
survive with certainty a substantial number of stress cycles, which 
thus represents a threshold value of the phenomenon. 

Therefore a minimum number of cycles (“Minimum life”) is now 
introduced into the survivorship function. Although this generalization 
does not change the fundamental nature of the probability function, a 
new estimate of the parameters becomes necessary. The extremal dis- 
tribution function used in the following has already been introduced 
on a purely empirical basis by Weibull [6, 7] in his analysis of the dis- 
tribution of the stress amplitude S for constant values of N. 


1. THE LINEAR THEORY 


In the previous theory [2] the probability $l(N)_S$ of surviving N cycles
under the stress S was


(1.1)  $l(N)_S = \exp[-(N/V_S)^{\alpha_S}]$,

with the boundary conditions valid for all values of S

(1.1’)  $l(0)_S = 1; \quad l(\infty)_S = 0.$

In equation (1.1) $V_S$ is the characteristic number of cycles correspond-
ing to the probability


(1.2)  $l(V_S)_S = 1/e$


and $1/\alpha_S$ is proportional to the standard deviation of the logarithms of
the number of cycles. If we write


(1.3)  $\ln[-\ln l(N)_S] = y; \quad 1/\alpha_S' = 0.43429/\alpha_S$


where ln stands for the natural logarithm, the relation (1.1) takes on
the linear form


(1.4)  $y = \alpha_S'(\log N - \log V_S); \quad l(N)_S = \exp[-e^{y}]$


where y is a reduced variate without dimension and log stands for the 
common logarithm. The survivorship function (1.4) which is known by 
the actuaries as the Gompertz function has recently been tabulated 
by the National Bureau of Standards [3]. 

Instead of the theoretical limits $0 < N < \infty$, practical limits for the
survivorship function are given by the interval


(1.5)  $V_S e^{-12.5/\alpha_S} < N \le V_S e^{2.5/\alpha_S}$


for the number of cycles. In this approximation probabilities of the
order $10^{-6}$ are neglected.

For homogeneous material and testing procedures, the probability of 
surviving a fixed number of cycles N decreases with increasing stress 
amplitude S. Since the relation (1.4) between log N and y is linear and
$\alpha_S'$ is the slope of this line, the $\alpha_S$ are constant and independent of
S, which means that the probabilities $l(N)_S$ traced on extremal prob-
ability paper against log N are parallel straight lines, which therefore
cannot intersect. It follows that the estimates of $\alpha_S$ should be constant
within errors of random sampling. However the acceptable domain of 
variation of the estimates cannot be established a priori since it de- 
pends upon the spacing of the stress levels, i.e., on the experimental 
design. 

The first equation (1.4) gives a graphical criterion for the validity 
of the theory: The logarithms of the observed numbers of cycles N at 
fracture traced on extremal probability paper against the reduced 
variate y should be scattered about the straight line 


(1.6)  $\log N = \log V_S + y/\alpha_S'$.


In this formula the y corresponding to the observed numbers
$N_m$ (m = 1, 2, ..., n) are obtained from the plotting positions [2]

(1.7)  $l(N_m)_S = 1 - m/(n + 1)$


where m stands for the ranks of the observed numbers $N_m$, arranged in
increasing magnitudes and n is the total number of specimens tested.
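The quantities of the linear theory are easily computed. The sketch below, in Python, is an illustration only; the function names are assumptions introduced here, and the exact value of $\log_{10} e$ is used where the text rounds to 0.43429.

```python
import math

LOG10_E = math.log10(math.e)  # rounded to 0.43429 in (1.3)

def survival_linear(n_cycles, v_s, alpha_s):
    """Equation (1.1): probability of surviving n_cycles at characteristic number v_s."""
    return math.exp(-((n_cycles / v_s) ** alpha_s))

def reduced_variate(survival):
    """Equation (1.3): y = ln(-ln l)."""
    return math.log(-math.log(survival))

def plotting_positions(n):
    """Equation (1.7): l(N_m) = 1 - m/(n + 1) for the ranks m = 1, ..., n."""
    return [1.0 - m / (n + 1.0) for m in range(1, n + 1)]

def log_cycles_on_line(y, v_s, alpha_s):
    """Equation (1.6): log10 N = log10 V_S + y / alpha_S'."""
    alpha_prime = alpha_s / LOG10_E
    return math.log10(v_s) + y / alpha_prime
```

To plot a sample, the ordered numbers of cycles $N_m$ would be paired with `reduced_variate(p)` for each `p` in `plotting_positions(n)`, and the points compared with the straight line (1.6).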

It was shown [2] that the theory (1.1) fits torsion fatigue test results 
for copper and aluminum at relatively high stress levels. However, for 
low stress levels the test results approach a curve which, for small 
numbers of cycles, is definitely bending to the right (upwards). This 
means that fracture can occur only after a certain number of cycles. 
This number is henceforth called the minimum life. It is a physical 
constant for given material and testing procedure and its estimate is 
subject to statistical variations. 

In addition to the graphical criterion, there is a numerical criterion:
If the theory (1.1) holds, the arithmetic and geometric standard devi-
ations $\sigma(N)$ and $\sigma(\log N)$ are related as shown in Table I of the previ-
ous paper [2]. This relation, traced in Figure 1, leads to the following
procedure: We estimate the characteristic number $V_S$, calculate the
standard deviation s(N) and the geometric standard deviation s(log N),
and check whether the quotient $s(N)/V_S$ corresponds to the value
prescribed by the graph from the sample value s(log N). If this rela-
tion is not fulfilled, the assumption cannot be sustained.

Fig. 1. Criterion for the validity of the linear theory.


2. THE GENERAL THEORY 


If the assumption that the minimum life is zero is refuted by one of
the above criteria, the asymptotic probability function (1.1) is used
limiting the variate N by the condition $N \ge N_{0,S}$, whence

(2.1)  $l(N)_S = \exp\left[-\left(\dfrac{N - N_{0,S}}{V_S - N_{0,S}}\right)^{\alpha_S}\right]$

with the properties

(2.2)  $l(V_S)_S = 1/e; \quad l(N_{0,S})_S = 1.$


For a given stress amplitude S all specimens survive any number of 
cycles up to the minimum life No,s. For values approaching No,s the 
survivorship function (2.1) is bent to the right and approaches a 
straight line perpendicular to the N axis as shown in Figures 7 and 9. 

In (2.1) $V_S$ and $N_{0,S}$ are parameters of location, have the dimension
of N and the relation $V_S > N_{0,S} \ge 0$, while $1/\alpha_S$ is a parameter of scale
without dimension. In the case $N_{0,S} = 0$ the formulae of the previous
paper [2] are obtained. Within the framework of the extremal theory
the generalization (2.1) of (1.1) is legitimate since a linear translation 
of an extreme is still an extreme. 

The parameter Vs has the same meaning in the previous (linear) and 
the present (general) theory. Since it corresponds to a common fixed 
probability it decreases in both theories with increasing values of S. 
However, within the two theories different estimators have to be used. 
For all values of S where the minimum life No,s differs from zero, it 
decreases with increasing S. 

In the special case $\alpha_S = 1$, $1/\alpha_S' = 0.43429$ the probability of survival
(2.1) degenerates into an exponential function. This probability traced
on semi-logarithmic paper is a linear function of the numbers of cycles.
The parameter $V_S$ coincides with the mean $\bar N_S$. The standard deviation
$\sigma(N)_S$ is

(2.3)  $\sigma(N)_S = V_S - N_{0,S}$.


Therefore, the minimum life $N_{0,S}$ may be estimated in this case by the
difference of the mean and the standard deviation of the number of
cycles. Another estimation based on the smallest number of cycles,
which in this case is the most precise one, was given by J. Neyman and
E. S. Pearson [4]. This model seems unrealistic for fatigue observation
because the exponential function implies that the expectation of
future life is independent of the preceding number of cycles, i.e., of the
“history.” This assumption is compatible with certain physical processes,
such as radio-active decay, but not with fatigue. In fact all observations
available on fatigue lead to estimations for $\alpha_S$ which exceed unity.
For any value of $\alpha_S$ the median life $\breve N_S$ obtained from (2.1) is


(2.4)  $\breve N_S - N_{0,S} = (V_S - N_{0,S})(\ln 2)^{1/\alpha_S}$.


To obtain the modal life $\tilde N_S$ consider the distribution $p(N)_S$ of the num-
ber of cycles at failure obtained from (2.1) as


(2.5)  $p(N)_S = \dfrac{\alpha_S}{V_S - N_{0,S}} \left(\dfrac{N - N_{0,S}}{V_S - N_{0,S}}\right)^{\alpha_S - 1} \exp\left[-\left(\dfrac{N - N_{0,S}}{V_S - N_{0,S}}\right)^{\alpha_S}\right]$.


Differentiation with regard to N leads to the mode

(2.6)  $\tilde N_S - N_{0,S} = (V_S - N_{0,S})(1 - 1/\alpha_S)^{1/\alpha_S}$.

A mode exists only for $1/\alpha_S < 1$; $1/\alpha_S' < 0.43429$ and

(2.7)  the mode precedes, equals, or exceeds the median if
       $1/\alpha_S >, =, < 0.30685$; $1/\alpha_S' >, =, < 0.13326$, respectively.


For $\alpha_S = 3.25889$ the distribution (2.5) is nearly symmetrical. Three
other pseudosymmetrical cases will show up later.
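The survivorship function with a minimum life, its median and its mode can be evaluated as follows. This Python sketch is illustrative only; the function names and the constant `ALPHA_SYMMETRIC` are introduced here, the latter following from setting the mode equal to the median, which gives $1 - 1/\alpha_S = \ln 2$.

```python
import math

def survival(n, v, n0, alpha):
    """Survivorship with minimum life n0: certain survival up to n0, extremal law beyond."""
    if n <= n0:
        return 1.0
    return math.exp(-(((n - n0) / (v - n0)) ** alpha))

def median_life(v, n0, alpha):
    """Equation (2.4)."""
    return n0 + (v - n0) * math.log(2.0) ** (1.0 / alpha)

def modal_life(v, n0, alpha):
    """Equation (2.6); a mode exists only for alpha > 1."""
    if alpha <= 1.0:
        raise ValueError("no mode exists for alpha <= 1")
    return n0 + (v - n0) * (1.0 - 1.0 / alpha) ** (1.0 / alpha)

# Mode equals median when 1 - 1/alpha = ln 2, i.e. 1/alpha = 0.30685.
ALPHA_SYMMETRIC = 1.0 / (1.0 - math.log(2.0))
```

For instance, `modal_life(200.0, 50.0, 5.0)` exceeds `median_life(200.0, 50.0, 5.0)` because $1/\alpha_S = 0.2 < 0.30685$, in agreement with (2.7).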

The density of probability at the mode increases with $\alpha_S$. This is
shown in Figure 2 where $(N - N_{0,S})/(V_S - N_{0,S})$ is used as abscissa.
For constant values of $N_{0,S}$ the median $\breve N_S$ and the mode $\tilde N_S$ converge
with increasing values of $\alpha_S$ towards the characteristic number $V_S$.
It will be shown in paragraph 3 that the same holds for the mean $\bar N_S$.

The probabilities of survival 1/2 and 1/e at the median $\breve N_S$ and at
$V_S$ are fixed. The probability $l(N)_S$ at the mode depends upon $\alpha_S$.
The median and the mode depend upon the three parameters $V_S$, $N_{0,S}$,
$\alpha_S$ and the same will be shown to hold for the mean, while the number
of cycles $V_S$ is itself one of the parameters of the distribution. Therefore,
an estimate of the location parameter $V_S$ is more convenient to char-
acterize the fatigue failure of a given material than the mean, the mode,
or the median.

Fig. 2. Influence of $\alpha_S$ on the shape of the distributions of $N_S$.

For the graphical representation of the numbers N at fracture the
reduced variate y defined in (1.3) is used. Equation (2.1) then becomes,
in analogy to (1.6),


(2.8)  $\log (N - N_{0,S}) = \log (V_S - N_{0,S}) + y/\alpha_S'$,


where $\alpha_S'$ and $\alpha_S$ are related by (1.3). In the previous (linear) theory the
parameter $1/\alpha_S'$ is the slope of log N plotted against y. In the present
theory it is the slope of $\log (N - N_{0,S})$ against y. Within the previous
theory $\alpha_S$ is independent of S, within the present theory it may depend
upon S. If $V_S$ is large compared to $N_{0,S}$ the logarithms of the number of
cycles at fracture traced on the extremal probability paper against y
are practically linear as long as N is in the neighborhood of $V_S$. How-
ever, if $N_{0,S}$ is of the same order as $V_S$ the curve is bent to the right
(upwards) for $N < V_S$. For the same values of $V_S$ and $N_{0,S}$, the asymp-
totic value $N_{0,S}$ is approached more quickly if $1/\alpha_S$ becomes larger.
Two curves are parallel if they have the same value of $1/\alpha_S$ and
$(V_S - N_{0,S})$ although the parameters $V_S$ and $N_{0,S}$ may differ.

The limiting condition (1.5) leads from (2.8) to a number of cycles


(2.9)  $N_{\omega,S} = N_{0,S} + (V_S - N_{0,S}) e^{2.5/\alpha_S}$


for which $l(N_\omega)_S$ is practically zero.

For a large number of cycles and small probabilities of survival a 
relatively small increase in the number of cycles considerably reduces 
the probability of survival. This holds for the linear and the general 
theory and corresponds to the popular statement of the straw that 
broke the camel’s back. For high probabilities of survival a consider- 
able decrease in the number of cycles is necessary in the linear theory 
in order to increase the probability of survival by a small amount, while 
in the general theory a small decrease in the number of cycles has a large 
influence on the probability of survival. 

It has often been assumed that the logarithms of the number of cycles 
to fracture in fatigue are normally distributed. However, in this case, 
the probability of survival converges in the same way to zero and to 
unity. Therefore the use of this distribution for extrapolation does not 
seem to be legitimate. 


3. ESTIMATE OF THE PARAMETERS 


Since the relation (2.8) between log N and y is no longer linear as
was the case for $N_{0,S} = 0$ the graphical estimate of the parameters given
previously [2] is no longer feasible. Instead the classical method of
moments advocated by Weibull [7] for the analysis of breaking
strengths is used. The special case $\alpha_S = 1$ was settled in (2.3).
In the general case $\alpha_S \neq 1$ the reduced moments of order k obtained
from the distribution (2.5) are the gamma functions


(3.1)  $\overline{\left(\dfrac{N - N_{0,S}}{V_S - N_{0,S}}\right)^{k}} = \Gamma(1 + k/\alpha_S)$.


For k = 1, 2, 3 three equations are obtained which may be used for the
estimation of the three parameters. The mean,


(3.2)  $\bar N_S - N_{0,S} = (V_S - N_{0,S})\,\Gamma(1 + 1/\alpha_S)$,


depends upon the three parameters and the probability at the mean
depends upon $\alpha_S$. The relation between the mean and the characteristic
number of cycles is


$\bar N_S \gtreqless V_S$ if $1/\alpha_S \gtreqless 1$; $1/\alpha_S' \gtreqless 0.43429$.


For increasing values of $\alpha_S$ the mean converges to the characteristic
number $V_S$. The variance $\sigma^2(N)_S$ obtained from (3.1) and (3.2),

(3.3)  $\sigma^2(N)_S = (V_S - N_{0,S})^2\,[\Gamma(1 + 2/\alpha_S) - \Gamma^2(1 + 1/\alpha_S)]$,


also depends upon the 3 parameters. The corresponding sample vari-
ance $s^2(N)_S = s_S^2$ is obtained by the usual procedure as


(3.3’)  $s_S^2 = n(\overline{N_S^2} - \bar N_S^2)/(n - 1)$.

The skewness $\sqrt{\beta_{1,S}}$ defined by

(3.4)  $\sqrt{\beta_{1,S}} = \mu_{3,S}\,\sigma_S^{-3}$

where $\mu_{3,S}$ is the third central moment is

(3.5)  $\sqrt{\beta_{1,S}} = [\Gamma(1 + 3/\alpha_S) - 3\Gamma(1 + 2/\alpha_S)\Gamma(1 + 1/\alpha_S) + 2\Gamma^3(1 + 1/\alpha_S)]\,[\Gamma(1 + 2/\alpha_S) - \Gamma^2(1 + 1/\alpha_S)]^{-3/2}$


and depends only upon the parameter $1/\alpha_S$. If the population value
$\sqrt{\beta_{1,S}}$ is replaced by the sample value


(3.5’)  $\sqrt{b_{1,S}} = \dfrac{\sqrt{n(n - 1)}}{n - 2} \cdot \dfrac{\overline{N_S^3} - 3\,\overline{N_S^2}\,\bar N_S + 2\bar N_S^3}{(\overline{N_S^2} - \bar N_S^2)^{3/2}}$


an estimate of $1/\alpha_S$ is obtained. To facilitate this procedure, the right
side of equation (3.5) as a function of $1/\alpha_S$ is given in Table 1, cols.
4 and 1.¹ The value of $1/\alpha_S'$ in equation (2.8) is obtained then from
(1.3).
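The reduced moments and the skewness are simple to evaluate with a gamma function routine. The Python sketch below is an illustration, not the computation used for Table 1; the argument `beta` denotes $1/\alpha_S$ and the function names are assumptions made here.

```python
import math

def reduced_moment(k, beta):
    """Equation (3.1): mean of ((N - N0)/(V - N0))**k, with beta = 1/alpha_S."""
    return math.gamma(1.0 + k * beta)

def mean_life(v, n0, beta):
    """Equation (3.2)."""
    return n0 + (v - n0) * reduced_moment(1, beta)

def variance(v, n0, beta):
    """Equation (3.3)."""
    return (v - n0) ** 2 * (reduced_moment(2, beta) - reduced_moment(1, beta) ** 2)

def skewness(beta):
    """Equation (3.5); depends on beta = 1/alpha_S only."""
    g1, g2, g3 = (reduced_moment(k, beta) for k in (1, 2, 3))
    return (g3 - 3.0 * g2 * g1 + 2.0 * g1 ** 3) / (g2 - g1 * g1) ** 1.5
```

In the exponential case `beta = 1.0` the mean equals $V_S$, the standard deviation equals $V_S - N_{0,S}$ as in (2.3), and the skewness is 2.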

¹ This table was calculated by Gladys R. Garabedian of Stanford University. The authors take
this occasion to thank her for this important contribution.

The two remaining parameters of location, the characteristic number
$V_S$ and the minimum life $N_{0,S}$, are simple to estimate. Combination of
(3.3) and (3.2) leads to


(3.6)  $V_S = \bar N_S + \sigma_S A(\alpha_S)$,


where the standardized distance from the characteristic number to
the mean


(3.6’)  $A(\alpha_S) = [1 - \Gamma(1 + 1/\alpha_S)]\,[\Gamma(1 + 2/\alpha_S) - \Gamma^2(1 + 1/\alpha_S)]^{-1/2}$


is given in Table 1, col. 3. 

Since $1/\alpha_S$ is estimated from (3.5), the parameter $V_S$ may be esti-
mated from (3.6) after replacing the population mean and standard
deviation by the sample values. The result can be checked from the
observations traced on the extremal probability paper with the help
of the first equation (2.2).

To estimate the minimum life the value of $V_S$ given in (3.6) is intro-
duced in (3.2) whence


(3.7)  $N_{0,S} = V_S - \sigma_S B(\alpha_S)$


where the standardized distance from the characteristic number to
the minimum life


(3.7’)  $B(\alpha_S) = [\Gamma(1 + 2/\alpha_S) - \Gamma^2(1 + 1/\alpha_S)]^{-1/2}$


is given in Table 1, col. 2. For the estimation of $N_{0,S}$ we use the previ-
ous estimates of $1/\alpha_S$ and of $V_S$ and replace the population value $\sigma_S$
by the sample value $s_S$.

Thus the estimate $\hat\alpha_S$ is obtained from $\sqrt{b_{1,S}}$, equation (3.5’), with
the help of Table 1. The two other estimates are from (3.6) and (3.7),


(3.8)  $\hat V_S = \bar N_S + s_S A(\hat\alpha_S); \quad \hat N_{0,S} = \hat V_S - s_S B(\hat\alpha_S)$.


The minimum life is thus estimated directly, without using iterated
procedures.
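The three estimates can also be obtained without the table by solving (3.5) numerically. The Python sketch below is an illustration under stated assumptions: the skewness equation is solved by bisection instead of interpolation in Table 1, the skewness is assumed to increase with $1/\alpha_S$ over the bracket used, and the function names are hypothetical.

```python
import math

LOG10_E = math.log10(math.e)  # rounded to 0.43429 in (1.3)

def gammas(beta):
    """Gamma(1 + k*beta) for k = 1, 2, 3, with beta = 1/alpha_S."""
    return tuple(math.gamma(1.0 + k * beta) for k in (1, 2, 3))

def skewness(beta):
    """Right side of (3.5)."""
    g1, g2, g3 = gammas(beta)
    return (g3 - 3.0 * g2 * g1 + 2.0 * g1 ** 3) / (g2 - g1 * g1) ** 1.5

def solve_beta(sample_skew, lo=0.01, hi=5.0):
    """Bisection for 1/alpha_S; assumes skewness increases with beta on [lo, hi]."""
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if skewness(mid) < sample_skew:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def estimate(mean, std, sample_skew):
    """Equation (3.8): returns (1/alpha_S, V_S, N_0,S)."""
    beta = solve_beta(sample_skew)
    g1, g2, _ = gammas(beta)
    b = (g2 - g1 * g1) ** -0.5   # B(alpha_S), equation (3.7')
    a = (1.0 - g1) * b           # A(alpha_S), equation (3.6')
    v = mean + std * a
    return beta, v, v - std * b

# Nickel wire, lowest stress level of Table 3; the skewness is taken here as the
# tabulated third moment divided by the cube of the standard deviation (an assumption).
beta, v_hat, n0_hat = estimate(1232.75, 132.7193, 2861621.0 / 132.7193 ** 3)
```

With these inputs `beta * LOG10_E`, `v_hat` and `n0_hat` should come out close to the scale parameter, characteristic number and sensitivity limit listed for that stress level in Table 3.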

The result may be checked by another estimate based on the observed
smallest number $N_1$ of cycles at fracture. Its plotting position $1 - 1/(n + 1)$
for n observations obtained from (1.7) and equation (2.1) lead to
the unbiased estimate


(3.8’)  $N'_{0,S} = \dfrac{N_1 (n + 1)^{1/\alpha_S} - V_S}{(n + 1)^{1/\alpha_S} - 1}$.


This estimate is always smaller than the observed smallest number of
cycles. Of course, this method also requires the previous estimate of the
two other parameters $1/\alpha_S$ and $V_S$.
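The check based on the smallest observation amounts to one line of arithmetic. The following Python sketch is illustrative; the function name and the sample values in the usage line are assumptions, not data from the paper.

```python
def minimum_life_from_smallest(n1, n, v, alpha):
    """Estimate of the minimum life from the smallest observed number of cycles n1,
    the sample size n, and previous estimates of V_S (v) and alpha_S (alpha)."""
    k = (n + 1.0) ** (1.0 / alpha)
    return (n1 * k - v) / (k - 1.0)

# Hypothetical values: smallest observation 1100, 20 specimens, V_S = 1250, alpha_S = 1.4.
n0_check = minimum_life_from_smallest(1100.0, 20, 1250.0, 1.4)
```

Since $V_S$ exceeds the smallest observation, the result lies below $N_1$, as stated in the text.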
TABLE 1
ESTIMATION OF THE THREE PARAMETERS

Col. 1: scale parameter $1/\alpha_S$.
Col. 2: multiple of standard deviation for minimum life $N_{0,S}$, $B(\alpha_S)$, equ. (3.7’).
Col. 3: multiple of standard deviation for parameter $V_S$, $A(\alpha_S)$, equ. (3.6’).
Col. 4: reduced 3d moment $\sqrt{\beta_1(\alpha_S)}$, equ. (3.5).








01 78.9817 -4481 —1.0813 
.02 39.9890 -4461 —1.0249 
.03 26 .9862 -4439 — .9707 
04 20 .4808 -4416 — .9185 
-05 16.5744 -4392 — .8680 


.06 13.9673 -4366 .8191 
.07 12.1029 -4339 -7717 
-08 10.7024 .4310 -7258 
.09 9.6114 .4281 -6811 
.10 8.7369 .4250 -6376 


ll 8.0199 -4219 .5953 
12 7.4209 .4186 -5540 
13 6.9128 -4152 -5137 
14 6.4761 -4118 -4743 
15 6.0965 -4082 .4357 


16 5.7633 -4046 -3980 
17 5.4682 -4008 .8610 
18 5.2050 .3970 -3247 
.19 4.9686 -3931 -2891 
20 4.7549 -3891 .2541 


21 4.5608 .3850 -2197 
22 4.3835 -3809 .1858 
.23 4.2209 ‘ .1525 
24 4.0711 ‘ - 1196 
25 3.9326 : -0872 


26 3.8041 é .0553 
.27 3.6844 ‘ .0237 
28 3.5727 ‘ -0075 
29 3.4680 ‘ -0383 
30 3.3698 ‘ .0687 


31 3.2774 
32 3.1901 
-33 3.1077 
34 3.0296 
35 2.9554 








2.8849 .3168 { (2455 
2.8178 3113 ‘(2741 
7537 .3069 3024 
6925 .3019 .3306 
.6339 . 2969 3586 


tS bb be 


5778 .2919 .3865 
.5239 . 2868 .4141 
-4721 ° .2817 -4417 
.4224 . 2766 -4691 
.3755 .2715 -4963 


Le ee) 


.3282 . 2663 5235 
. 2836 -2612 -5505 
. 2406 . 2560 .5775 
.1989 .2508 -6043 
. 1587 - 2456 -6311 


NNN dS 


.1196 . 2404 .6578 
-0818 2352 -6845 
.0451 2299 -7110 
.0095 . 2247 .7376 
-9749 -2195 .7640 


2 
2 
2 
2 
1 


.9412 .2142 - 7905 
-9085 . 2090 -8169 
.8767 . 2038 -8433 
.8456 .1985 .8697 
-8154 .1933 -8960 


— tt ht et 


-7859 1881 .9224 
7571 . 1829 -9488 
.7290 .1777 -9751 
.7016 1725 -0015 
.6748 .1673 .0279 


— et et et 


.6486 .1621 .0544 
.6230 .1570 .0808 
.5870 .1518 1073 
.5734 . 1467 - 1338 
.5494 .1416 . 1604 


—_ i it ht et 


.5259 . 1365 .1870 
.5029 .1314 .2137 








1.4803 . 1263 1.2404 
1.4581 .1213 1.2672 
1.4364 .1163 1.2941 


4151 -1113 .3210 
.3942 . 1063 .3480 
.3737 .1013 -8751 
.3535 .0964 .4023 
.3338 -0915 -4295 


.3443 .0866 -4569 
2952 .0818 .4844 
.2765 .0770 -5119 
. 2580 .0722 .5396 
. 2399 -0674 .5674 


. 2220 .0627 .5953 
-2045 -0580 .6233 
.1872 .0533 .6514 
.1703 .0487 .6797 
- 1536 .0441 . 7080 


1371 -0395 -7366 
. 1209 .0350 - 7652 
. 1050 -0305 .7940 
-0893 .0260 .8230 
.0738 .0216 .8521 


-0586 .0172 .8814 
.0436 .0129 .9108 
.0289 -0085 -9403 
.0143 -0042 -9701 
0 0.0 0 


-0766 2.6400 
.1364 3.3820 
.1797 4.2621 
-2081 5.3235 


- 2236 6.6188 
-1912 19.5849 
-1154 60.0917 
.0626 190.1132 










Finally, the theoretical numbers N corresponding to given probabili-
ties $l(N)_S$ are obtained from (2.8) as

(3.9)  $N = N_{0,S} + (V_S - N_{0,S})\,e^{y/\alpha_S}$.


4. INFLUENCE OF THE PARAMETERS


To show the influence of the parameters the values $N_0 = N_{0,S} = 1{,}000$;
$1/\alpha' = 1/\alpha_S' = 0.1$ and different values of $V_S$ are chosen. In this simpli-
fied scheme it is assumed that the parameter $\alpha$ and the minimum life
$N_0$ are both independent of S, while $\alpha_S$ may and $N_{0,S}$ will depend upon
S. The numbers of cycles at fracture N in 1,000 obtained from (2.8)


(4.1)  $\log (N - 1) = \log (V_S - 1) + 0.1\,y$


are traced in Figure 3. The graph shows how the survivorship functions
for different values of $V_S$ converge to unity for a common value of $N_0$.
Within the observable part $0.0476 \le l(N)_S \le 0.9524$ for n = 20 speci-
mens obtained from (1.7) the survivorship functions look fairly linear
and the slopes seem to increase systematically with decreasing $V_S$, i.e.,
with increasing stress levels, while in reality it was assumed that the
minimum life is invariant against changes in S and that the parameters
$1/\alpha_S$ are constant. Thus the graph may serve as a warning against
relying too much on the graphical representation for the usual small
samples.

Fig. 3. Theoretical survivorship functions $l(N)_S$ for constant
$N_{0,S} = 1.0$, $1/\alpha_S' = 0.1$ and various values of $V_S$.
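The warning can be made concrete by computing the slope that a straight line drawn through the observable part of each curve would suggest. The Python sketch below is an illustration only; it measures the chord slope of log N against y (an apparent value of $1/\alpha_S'$), which is an assumption about how the graph would be read, and the function names are hypothetical. On a plot with log N as abscissa the visual slope is the reciprocal of this quantity.

```python
import math

def log_cycles(y, v, n0=1.0, inv_alpha_prime=0.1):
    """Equation (4.1) solved for log10 N, with N in thousands of cycles."""
    return math.log10(n0 + (v - n0) * 10.0 ** (inv_alpha_prime * y))

def observable_range(n=20):
    """Reduced variates at the largest and smallest plotting positions (1.7)."""
    y_low = math.log(-math.log(1.0 - 1.0 / (n + 1.0)))
    y_high = math.log(-math.log(1.0 - n / (n + 1.0)))
    return y_low, y_high

def apparent_inv_alpha_prime(v, n=20):
    """Chord slope of log10 N against y over the observable range."""
    y_low, y_high = observable_range(n)
    return (log_cycles(y_high, v) - log_cycles(y_low, v)) / (y_high - y_low)
```

For every $V_S$ the apparent value falls short of the true 0.1, and the shortfall grows as $V_S$ approaches the minimum life, although the scale parameter was held constant.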

Since the normal distribution has sometimes been used in connection
with fatigue observations, it is worthwhile to analyze under what con-
ditions the distributions (2.5) look symmetrical. In addition to the
pseudo-symmetrical case (2.7) where the median and the mode co-
incide three other pseudo-symmetrical cases exist. Comparison of
equations (2.4), (2.6), and (3.2) shows that the mean is equal to the
median


(4.2)  $\bar N_S = \breve N_S$ if $1/\alpha_S = 0.29075$,


and that the mean is equal to the mode

(4.3)  $\bar N_S = \tilde N_S$ if $1/\alpha_S = 0.30189$.


Table 1 shows the existence of a fourth pseudo-symmetrical case. The
third central moment is zero for $1/\alpha_S = 0.27760$. It follows from Table 1
that the distributions look symmetrical if the skewness is near to the
interval


(4.4)  $0 \le \sqrt{\beta_{1,S}} \le 0.1$.

Then the scale parameter is near to the interval

(4.5)  $0.27 < 1/\alpha_S < 0.31$; $3.2 < \alpha_S < 3.7$; $0.12 < 1/\alpha_S' < 0.13$.


The four values of $\alpha_S$ are so close one to another that no practical
distinction between these cases is possible. Of course none implies a
normal distribution of the number of cycles at fracture.
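The four pseudo-symmetrical values of $1/\alpha_S$ can be recovered numerically from (2.4), (2.6), (3.2) and (3.5). The Python sketch below is illustrative; the bisection routine, the bracketing interval and the names are assumptions made for this example, with `beta` standing for $1/\alpha_S$.

```python
import math

def bisect(f, lo, hi, iterations=80):
    """Root of f on [lo, hi], assuming f(lo) < 0 < f(hi)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def skewness(beta):
    """Right side of (3.5)."""
    g1, g2, g3 = (math.gamma(1.0 + k * beta) for k in (1, 2, 3))
    return (g3 - 3.0 * g2 * g1 + 2.0 * g1 ** 3) / (g2 - g1 * g1) ** 1.5

# Each condition compares reduced locations (N - N0)/(V - N0).
mode_equals_median = 1.0 - math.log(2.0)
mean_equals_median = bisect(lambda b: math.gamma(1.0 + b) - math.log(2.0) ** b, 0.1, 0.6)
mean_equals_mode = bisect(lambda b: math.gamma(1.0 + b) - (1.0 - b) ** b, 0.1, 0.6)
zero_skewness = bisect(skewness, 0.1, 0.6)
```

The four values should fall within the interval (4.5), which is the sense in which no practical distinction between the cases is possible.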

The values of Table 1 are drawn in Figures 4, 5, and 6. Figure 4
shows the parameter $1/\alpha_S$ and $1/\alpha_S'$ and the standardized distances
from the characteristic number to the mean and to the minimum life

(4.6)  $A(\alpha_S) = (V_S - \bar N_S)/\sigma_S; \quad B(\alpha_S) = (V_S - N_{0,S})/\sigma_S$


as functions of the skewness $\sqrt{\beta_{1,S}}$. The distances decrease with in-
creasing skewness. Within the interesting domain of $1/\alpha_S$ the distance
from the mean to $V_S$ is smaller than the distance from $N_{0,S}$ to the
mean. Figure 5 compares the standardized distance from $V_S$ to $N_{0,S}$
to the standardized distance from $\bar N_S$ to $N_{0,S}$. Both are traced as func-
tions of the skewness and of the parameter $1/\alpha_S$ and $1/\alpha_S'$. Finally
Figure 6 which shows the standardized distance from $V_S$ to $N_{0,S}$ as
function of $1/\alpha_S$ and $1/\alpha_S'$ facilitates the estimation of the minimum
life.

Fig. 4. Parameter $1/\alpha_S$ and standardized distances $A(\alpha_S)$ and
$B(\alpha_S)$ as functions of skewness. (See Table 1.)

The estimations $\hat N_{0,S}$, $\hat V_S$, and $1/\hat\alpha_S$ and the three statistics $\bar N_S$,
$s_S$, $\sqrt{b_{1,S}}$ are related by the sample analogs of the three equations (3.5),
(3.6), and (3.7) which will now be analyzed. The estimation of the
parameter $1/\alpha_S$ depends only on the statistic $\sqrt{b_{1,S}}$. The population
value $1/\alpha_S$ increases with the skewness as shown in Figure 4. In the
previous (linear) theory this parameter was a function of the standard
deviation of the logarithms of the number of cycles. In the present
(general) theory it is a function of the skewness of the number of cycles.

The partial derivatives of $\hat V_S$ and $\hat N_{0,S}$ with respect to each of the
three statistics $\bar N_S$, $s_S$, $\sqrt{b_{1,S}}$, obtained from (3.8) lead to the following
relations: For constant values of the standard deviation and the skew-
ness the estimates of the characteristic number of cycles $V_S$ and of the
minimum life $N_{0,S}$ increase proportionally to the mean $\bar N_S$. For con-
stant mean and skewness and increasing standard deviations the esti-
mates of the characteristic number of cycles increase for $1/\alpha_S < 1$ and
decrease for $1/\alpha_S > 1$ and the minimum life decreases. For constant
mean and standard deviation the estimates of the characteristic num-
ber decrease and the minimum life increases with the skewness. For
constant mean and skewness and increasing standard deviation the
difference $V_S - N_{0,S}$ increases. For constant mean and standard devia-
tion and increasing skewness the difference decreases.

Fig. 5. Standardized distances as functions of parameter $1/\alpha_S$. (See Table 1.)

Fig. 6. Estimation of sensitivity limit $N_{0,S}$. (See Table 1.)

The equations (3.6) and (3.7) lead to a new criterion which deter-
mines whether the minimum life $N_{0,S}$ vanishes or not, since


$N_{0,S} = 0$ if $\bar N_S + s_S(N)\,(A(\hat\alpha_S) - B(\hat\alpha_S)) = 0$.

This condition may be written from (3.6’) and (3.7’)

(4.7)  $N_{0,S} = 0$ if $\overline{N_S^2}/\bar N_S^2 = \Gamma(1 + 2/\hat\alpha_S)/\Gamma^2(1 + 1/\hat\alpha_S)$.

The minimum life is taken to be zero, if the equality is fulfilled within
the errors of random sampling. The same assumption must be made if
the minimum life turns out to be negative.

Finally the limiting number of cycles $N_{\omega,S}$ for which the probability
of survival is practically zero, is obtained from (2.9), (3.7) and (3.6)
as


(4.8)  $N_{\omega,S} = \bar N_S + \sigma_S\,[A(\alpha_S) + B(\alpha_S)(e^{2.5/\alpha_S} - 1)]$.


It may be estimated by replacing the population values $\bar N_S$ and $\sigma_S$
by the sample values and using the functions $A(\alpha_S)$ and $B(\alpha_S)$ given in
Table 1.

If the computed minimum life is longer than the smallest number
of cycles at fracture, or if the largest observed number exceeds $N_{\omega,S}$,
it must be taken into account in the evaluation of such contradictions
that the estimations of $N_{0,S}$ and $N_{\omega,S}$ are subject to considerable errors
of random sampling, which so far are unknown. This holds also for the
different criteria to be used to prove or disprove the existence of a 
non-vanishing minimum life. 


5. FATIGUE IN NICKEL AND ALUMINUM 
Table 2 summarizes the observed number of cycles to fracture at
different stress amplitudes for nickel. These data, taken from Ravilly's
observations [5], are traced on logarithmic extremal probability paper
in Figure 7 using the plotting positions (1.7).

TABLE 2
NICKEL: REVERSED TORSION WIRE TESTS (RAVILLY [5])
Fatigue Life N, Thousands of Cycles; Elastic Stresses S in kg/mm.²

Table 3 gives the estimations of the three parameters. The estimates
for the minimum life $N_{0,S}$ and the characteristic number $V_S$ diminish
with increasing stress level as might be expected. The estimated values
of the scale parameters $1/\alpha_S$ indicate that in most cases the mode
exceeds the median, a fact that contradicts the usual assumptions
concerning the skewness of the distribution functions in fatigue. Within the
linear theory, the estimates of $1/\alpha_S'$ do not show any systematic
dependence upon S and the hypothesis that their variation is due to
chance seems admissible on the basis of the considered tests.

TABLE 3
PARAMETERS FOR NICKEL WIRE

Non-Linear Theory

Stress Level (kg. per mm.²)   ±15.5        ±18.0       ±21.5      ±25.5
Number of Spec.               20           20          20         20
Mean                          1,232.75     479.75      252.45     154.625
Stand. Dev.                   132.7193     34.7046     24.4228    17.0979
Third Moment                  2,861,621.   13,422.2    2,524.58   -1,712.34
Scale Parameter               0.31438      0.16791     0.14554    0.07604
Char. Number                  1,249.93     490.29      260.48     161.44
Sensitivity Limit             1,051.63     396.13      185.56     70.24

Linear Theory

Stress Level (kg. per mm.²): ±25.5 ±30 ±33 ±37.5 ±44.5 ±49 ±56
Number of Spec.: 20 20 20 20 20
log N: 5.18664 4.99871 4.85573 4.69950 4.47015
s(log N): 0.04850 0.03490 0.03594 0.03813 0.04118
$1/\alpha_S'$: 0.04564 0.03284 0.03382 0.03588 0.03875
log $V_S$: 5.21001 5.01558 4.87309 4.71793 4.49005
$V_S$: 162.21 103.65 74.66 52.23 30.91

Figure 7 shows that for nickel tested in reversed torsion the linear
theory gives an excellent fit for stress levels equal to or exceeding 25.5
kg. mm.$^{-2}$, for which, therefore, no minimum life appears to exist. For
stress levels below 25.5 kg. mm.$^{-2}$, however, the observations can be
better fitted by the three-parameter survivorship function, which
bends upwards for decreasing numbers of cycles, indicating the
existence of a minimum life. For the sake of comparison the theoretical
survivorship functions for the linear, two-parameter theory (1.1) and
for the general, three-parameter theory (2.1) are both shown for the
stress level 25.5 kg. mm.$^{-2}$.

It is interesting to note in Table 3 that the two estimates for $V_{25.5}$
are practically equal. The observed and the two theoretical survivorship
functions in this case are also shown in Figure 8 in the conventional
way, where the number of cycles is traced as abscissa and the
survivorship function as ordinate, both in linear scales. Again, no preference
can be given to either theory.

Fig. 8. Survivorship function for nickel at S = ±25.5 kg. mm.$^{-2}$.
(See Tables 2 and 3.)


Table 4 shows the calculation of the three parameters for Ravilly's
observations on aluminum wire for the three lowest stress levels. Figure
9 presents the data given in [2] together with the theoretical curves
obtained from Table 4. The survivorship function for the stress level
5.75 kg. mm.$^{-2}$ clearly shows how the minimum life converges toward
zero with increasing level of applied stress; if the minimum life is
sufficiently low and the stress level sufficiently high the general
(three-parameter) theory hardly differs from the linear one, since with
increasing stress levels the minimum life approaches zero so rapidly that
no distinction between the two theories is possible.

TABLE 4
THE THREE PARAMETERS FOR ALUMINUM WIRE [2]

Stress Level (kg. per mm.²): ±5.5 ±5.75
Number of Spec.: 20
Mean:
Stand. Dev.: 177.1485
Third Moment: 1,088,203.
Scale Parameter: 0.14888
Char. Number: 609.76
Sensitivity Limit $N_{0,S}$: 781.47 76.75


Tables 3 and 4 indicate that the $1/\alpha_S$ decrease with increasing stress
levels for those stresses where the minimum life does not vanish. While
similar relations have been reported by some investigators, other
observations show no such variation. The above relation cannot, therefore,
be accepted as well established. Since all estimated values of $1/\alpha_S$
are considerably below unity, the existence of an exponential distribution
of fatigue failures appears unlikely, for the reasons given above.

Fig. 9. Fatigue tests of annealed aluminum wire in reversed torsion.
(See Table 4.)




MINIMUM LIFE IN FATIGUE 597 


CONCLUSIONS 


The existence of an “incubation period” of fatigue, that is of a finite 
threshold number of cycles No,s below which, at a given stress level, 
fatigue failure will not occur and at which the probability of survival 
is therefore equal to unity has been established for certain metals and 
stress amplitudes. This phenomenon has also been confirmed by fatigue 
studies on both hard and mild structural steel at stress levels near and 
below their static yield stress. Therefore a “sensitivity limit” in terms 
of cycles (minimum life) appears to be as real an aspect of fatigue as 
the sensitivity limit in terms of stress (endurance limit), the existence 
of which is quite generally recognized. 

The present investigation is part of a research project on the basic 
aspects of fatigue, conducted at the Civil Engineering Research Labora- 
tories of Columbia University in New York, and is supported by the 
Research Corporation, the Higgins Fund and, in part, by the Office 
of Ordnance Research.


POINT ESTIMATES OF ORDINATES OF
CONCAVE FUNCTIONS


CLIFFORD HILDRETH*
North Carolina State College 


A method is developed for obtaining maximum likelihood 
estimates of points on a surface of unspecified algebraic form 
when ordinates of the points are required to satisfy a set of 
linear inequalities. A production function with one variable 
input is considered in some detail. In this case the restrictions 
follow from the assumption of non-increasing returns. An illus- 
trative computation is worked out using a procedure based on 
equivalence between the estimation problem and a certain 
saddle point problem. Alternative procedures for production 
functions with two variable inputs are sketched. 


1. INTRODUCTION 


ECONOMISTS are frequently in the position of having fairly strong
presumptions that relations among variables with which they deal
satisfy certain qualitative restrictions, but they seldom have very good 
grounds for saying that a particular algebraic form is appropriate for 
representing a given relation. Diminishing marginal productivity of 
inputs in production relations, downward slope of demand relations and 
homogeneity of certain demand and production relations are examples 
of properties which economists often assume. 

Unfortunately, statistical procedures available to economists typi- 
cally require that they completely ignore many of their a priori pre- 
sumptions in performing their statistical analyses or, alternatively, that 
they rather arbitrarily assume that a particular algebraic form satis- 
factorily represents the relation being investigated. In this paper pro- 
cedures are suggested for estimating points on a production surface of 
unspecified algebraic form from data on outputs produced by various 
combinations of inputs when the inputs are subject to diminishing re- 
turns. An example involving a single variable input is worked out as an 
illustration. More generally one could apply the approach presented 
here to a variety of situations in which an investigator knows some 
properties of a relation being studied but does not have sufficient infor- 
mation to put the relation into any simple parametric form. Such situa- 
tions are not uncommon so applications of the general approach may 
arise in several fields. However, the author's closer familiarity with
problems from economics makes it convenient to refer to this field
when a more specific context is wanted.

* The author is indebted to staff members of the Cowles Commission and to L. J. Savage of the
Committee on Statistics, University of Chicago, for suggestions and critical comments. This paper will
be reprinted as Journal Article No. 551 of the North Carolina Agricultural Experiment Station.

The procedures to be outlined should be regarded as supplementing 
rather than supplanting existing techniques. When an investigator is 
reasonably sure that a particular parametric form will satisfactorily 
represent the relevant properties of his relation, there will ordinarily be 
advantages of both efficiency and convenience in using the form. Meth- 
ods have also been developed for investigating particular aspects of a 
relation with only mild assumptions as to its form. Davis [4, Ch. 6] 
gives examples of smoothing devices that depend on local properties of 
a function. Robbins and Monro [13] have presented a stochastic method 
for approximating the value of a control variable associated with a 
selected mean response. 

Procedures for estimating the locus of an extreme value of a function 
and for approximating its properties in that neighborhood have been 
suggested by Hotelling [7] and extended and refined by Friedman and 
Savage [5] and Box and Wilson [2].¹ These seem particularly useful if
the investigator is interested in a fairly small region about the extreme
value and if he has either a pretty good a priori notion of where to find
the extremum or has the opportunity to draw a fairly large sequential
sample. If some estimates of value of the function over a rather large
region are needed, if fixed samples that cannot easily be repeated are 
the main source of information, and/or if there exists appropriate a 
priori information to be taken into account, then procedures like those 
suggested in what follows may have advantages. There is nothing to 
prevent an investigator from combining certain features of the present 
analysis with suggestions of Hotelling and the others if some opportu- 
nity exists to analyse certain initial data and then plan for the collection 
of additional observations. 

The reader will have a better basis for judging possible applications 
after the illustrative production analysis has been presented. Statistical 
problems arising from the generation of variables by simultaneous 
economic relations are not considered in the present discussion nor are 
types of statistical inference other than point estimation. However, 
conventional tests of hypotheses could be used in many of the contem- 
plated situations. 


2. PRODUCTION FUNCTIONS WITH ONE VARIABLE INPUT 
Consider a production relation of the form

(2.1) $y = \phi(z) + u$

where y represents output, z represents variable input and u is a random
disturbance. In general z may be a vector with as many components as
there are types of variable input but for the present we assume that all
inputs except one are held constant, or as nearly constant as physical
conditions permit. If inputs are defined broadly enough to include all
factors influencing output, then variations in u may be attributed to
unavoidable and unobserved variations in some of the constant inputs.
It is assumed that variations in u approximate independent drawings
from a normal distribution with zero mean and finite variance.

1 A brief review of these procedures has recently been published by Anderson [1].

An investigator is considered to have observations on y and z for N
selected values of z. Let the values of z be arranged in increasing order
and denoted by $z_1, z_2, \dots, z_n, \dots, z_N$. For each level of input there
may be several trials and corresponding observations of output. Let
$T_n$ be the number of trials at level of input $z_n$ and let $y_{nt}$ be the observed
output for the t-th trial at this level. We have

(2.2) $y_{nt} = \phi(z_n) + u_{nt}, \qquad n = 1, 2, \dots, N; \quad t = 1, 2, \dots, T_n.$


An economist with such data is principally interested in the possi- 
bility of drawing inferences about which levels of input are most 
profitable for various combinations of prices of output and of variable 
input (or conditions of demand for output and supply of input). Fre- 
quently such inferences have been drawn by assuming that the function 
¢(z) can be approximated by some given algebraic form with several 
unknown parameters to be estimated from the data. Estimates of the 
parameters are inserted into the form to obtain an estimated relation 
and this estimated relation is then used to calculate most profitable 
levels of input for chosen combinations of prices. 

The chief difficulty with this procedure is that the inferences often 
depend critically upon the algebraic form chosen. It is not uncommon 
to find that alternative forms fit the data almost equally well but have 
very different implications for the most profitable level of input. One 
example of this may be found in a study by Paul R. Johnson [8] which 
will be utilized further in Section 3. A recent article by Prais [12] 
emphasizes the critical importance of the form of equation chosen to 
represent a demand relation. This problem would arise more frequently 
in economic literature if it were given attention commensurate with its 
importance. Many economists uncritically accept the appropriateness 
of a functional form chosen largely on the basis of convention or con- 
venience; others try several forms for their relations but report only the 
one that in some sense “looks best” a posteriori.
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If the investigator knew the prices of y and z (or more generally if 
he knew the relevant demand relation for y and supply relation for z) 
and if these were expected to remain fixed during the period that his 
statistical results were to be used, then he could express net revenue as 
a function of z and regard this as the relation to be studied. If, in addi- 
tion, he could proceed to draw new observations of y, and therefore of 
net revenue, for chosen values of z, then he could apply the Hotelling 
(Friedman-Savage or Box-Wilson techniques could be used if z were a 
vector) procedure for estimating the point of maximum net revenue. 
If he wished the analysis to apply to various price (or demand and sup- 
ply) situations or if he needed to draw some inferences before obtaining 
new observations, then the above techniques would be difficult to apply. 

To compare all levels of input the researcher would, of course, have 
to make a very specific assumption about the form of ¢(z). However, 
in many studies this is not really necessary. If a reasonable basis can 
be found for comparing the profitabilities of levels of input for which 
data exist, such comparisons will often determine the optimal level of 
input to as close an approximation as the available data permit. The 
results are typically intended for use as a guide to actual producers. 
Conditions faced by these producers can never be exactly duplicated in 
experiments so there is always some error in transferring experimental 
results to commercial situations. If the data are gathered by surveying 
actual producing units, fundamentally the same problem will exist. 
There will always be some more or less relevant discrepancies between 
conditions faced by sample producers at the times they are surveyed 
and conditions faced by producers who ultimately use the results at the 
time their applications are made. Thus complete accuracy in the deter- 
mination of optimal input for the conditions represented by the data 
would always be superfluous even if it were possible. Furthermore, if 
comparisons among the observed levels of input are too crude when 
these considerations are taken into account, it is ordinarily possible to 
supplement the data and to obtain observations for additional levels 
in what is expected to be the relevant region. Indeed, it may frequently 
happen that an indication of the kinds of new observations that will 
prove useful may be one of the most valuable results of an initial exami- 
nation of production data. 

Let $\eta_n$ be the expected value of output at input level $z_n$.

(2.3) $\eta_n = \phi(z_n), \qquad n = 1, 2, \dots, N.$

We seek to construct a reasonable procedure for obtaining estimates of
the $\eta_n$; these can then be translated into estimates of expected
profitability of the $z_n$ for any given price combinations. Our estimates will be
derived by the method of maximum likelihood. However the same
results could be obtained by least squares and possibly other methods.
Of course, if there were no a priori restrictions on $\phi$, then the maximum
likelihood estimates of the ordinates $\eta_n$ would just be the means of
observed outputs for the appropriate levels of input, i.e.,

(2.4) $\tilde\eta_n = \bar y_n = \dfrac{1}{T_n}\sum_{t=1}^{T_n} y_{nt}$

where the $\tilde\eta_n$ might be called limited information maximum likelihood
estimates.² To obtain full maximum likelihood estimates (here denoted
by $\hat\eta_n$) the investigator must maximize the likelihood function subject
to all of the a priori restrictions he feels justified in imposing.

As was indicated earlier, in many production situations the
investigator will feel that inputs are subject to decreasing returns. This is
equivalent to assuming that $\phi$ is concave and yields the following
restrictions on the ordinates:³

(2.5) $\dfrac{\eta_{n+1} - \eta_n}{z_{n+1} - z_n} \ge \dfrac{\eta_{n+2} - \eta_{n+1}}{z_{n+2} - z_{n+1}}, \qquad n = 1, 2, \dots, N-2.$

For such cases it is desired to maximize the likelihood function subject
to (2.5). The logarithm of the likelihood function is given by

(2.6) $L(\eta, \sigma^2) = -\dfrac{T}{2}\log 2\pi\sigma^2 - \dfrac{1}{2\sigma^2}\sum_{n=1}^{N}\sum_{t=1}^{T_n}(y_{nt} - \eta_n)^2$

where $T = \sum_{n=1}^{N} T_n$ and $\eta = (\eta_1\ \eta_2 \cdots \eta_N)$. Since $\sigma^2$ is not restricted its
estimator can be obtained by differentiation, yielding⁴

(2.7) $\hat\sigma^2 = \dfrac{1}{T}\sum_{n=1}^{N}\sum_{t=1}^{T_n}(y_{nt} - \hat\eta_n)^2$

where the $\hat\eta_n$ are those values which maximize L and satisfy (2.5); thus
they minimize the double sum on the right of (2.6) under the
restrictions (2.5). We may write

(2.8) $\sum_{n=1}^{N}\sum_{t=1}^{T_n}(y_{nt} - \eta_n)^2 = \sum_{n=1}^{N}\sum_{t=1}^{T_n}(y_{nt} - \bar y_n)^2 + \sum_{n=1}^{N} T_n(\eta_n - \bar y_n)^2.$

The terms of the first summation over n on the right do not depend on
$\eta$ and may be ignored. Let x be an N-dimensional vector with elements

(2.9) $x_n = \eta_n - \bar y_n, \qquad n = 1, 2, \dots, N.$

The problem is then to minimize the weighted (by the $T_n$) sum of
squares of the $x_n$ subject to the requirements that the $\eta_n$ satisfy (2.5).
Equivalent restrictions on the $x_n$ are given by

(2.10) $-\dfrac{x_n}{\Delta_{n+1}} + \left(\dfrac{1}{\Delta_{n+1}} + \dfrac{1}{\Delta_{n+2}}\right)x_{n+1} - \dfrac{x_{n+2}}{\Delta_{n+2}} - \dfrac{\bar y_n}{\Delta_{n+1}} + \left(\dfrac{1}{\Delta_{n+1}} + \dfrac{1}{\Delta_{n+2}}\right)\bar y_{n+1} - \dfrac{\bar y_{n+2}}{\Delta_{n+2}} \ge 0$

where $\Delta_{n+2} = z_{n+2} - z_{n+1}$, $\Delta_{n+1} = z_{n+1} - z_n$, and $n = 1, 2, \dots, N-2$.

2 For an analogous use of this term in another context see Koopmans and Hood [9].

3 The investigator may typically feel justified in assuming strict concavity in which case the strict
inequalities would hold in (2.5). However, if these restrictions are imposed, the likelihood function
may not have a maximum but only a least upper bound. Maximizing the likelihood function subject
to the restrictions given is equivalent to finding the least upper bound in the region defined by
corresponding strict inequalities.

4 This estimator may be expected to have a substantial downward bias when the $T_n$ are small. Its
distribution has not yet been investigated.
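
The decomposition (2.8) and the concavity restrictions (2.5) are easy to check numerically. The following is a minimal sketch on made-up data; the function names and the numbers are hypothetical and serve only to illustrate the two formulas.

```python
def level_means(y):
    """y[n] holds the observed outputs at input level n; returns the means ybar_n of (2.4)."""
    return [sum(trials) / len(trials) for trials in y]

def sum_sq(y, eta):
    """Double sum in (2.6): sum over n and t of (y_nt - eta_n)**2."""
    return sum((obs - eta[n]) ** 2 for n, trials in enumerate(y) for obs in trials)

def violates_concavity(z, eta):
    """0-based indices n at which restriction (2.5) fails for the ordinates eta."""
    bad = []
    for n in range(len(z) - 2):
        left = (eta[n + 1] - eta[n]) / (z[n + 1] - z[n])
        right = (eta[n + 2] - eta[n + 1]) / (z[n + 2] - z[n + 1])
        if left < right:
            bad.append(n)
    return bad

# Hypothetical data: three input levels with 2, 3 and 1 trials.
z = [0.0, 20.0, 40.0]
y = [[10.0, 12.0], [14.0, 15.0, 16.0], [22.0]]
ybar = level_means(y)
eta = [11.0, 16.0, 21.0]  # an arbitrary trial set of ordinates

lhs = sum_sq(y, eta)
rhs = sum_sq(y, ybar) + sum(len(t) * (eta[n] - ybar[n]) ** 2 for n, t in enumerate(y))
```

Here `lhs` and `rhs` agree, as (2.8) requires, and `violates_concavity(z, ybar)` reports that the raw means fail (2.5), which is the situation in which the restricted estimates differ from the means.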

It will simplify the discussion to state the problem in matrix notation

(2.11) Find an $\hat x \in \mathcal{A}$ such that $\hat x D \hat x' \le x D x'$ for all $x \in \mathcal{A}$ where⁵

(i) $\mathcal{A}$ is the set of all vectors x that satisfy $Ax' + b' \ge 0$ and

(ii) $A = \begin{pmatrix}
-\dfrac{1}{\Delta_2} & \dfrac{1}{\Delta_2}+\dfrac{1}{\Delta_3} & -\dfrac{1}{\Delta_3} & 0 & \cdots & 0 & 0 & 0 \\
0 & -\dfrac{1}{\Delta_3} & \dfrac{1}{\Delta_3}+\dfrac{1}{\Delta_4} & -\dfrac{1}{\Delta_4} & \cdots & 0 & 0 & 0 \\
\vdots & & & & & & & \vdots \\
0 & 0 & 0 & 0 & \cdots & -\dfrac{1}{\Delta_{N-1}} & \dfrac{1}{\Delta_{N-1}}+\dfrac{1}{\Delta_N} & -\dfrac{1}{\Delta_N}
\end{pmatrix}$

(iii) $b' = A\bar y'$

(iv) $D = \begin{pmatrix} T_1 & & & 0 \\ & T_2 & & \\ & & \ddots & \\ 0 & & & T_N \end{pmatrix}$

5 In the special case in which the input levels are equally spaced and the same number of trials
exist for each level we may take D = I and

$A = \begin{pmatrix}
-1 & 2 & -1 & 0 & \cdots & 0 & 0 & 0 \\
0 & -1 & 2 & -1 & \cdots & 0 & 0 & 0 \\
\vdots & & & & & & & \vdots \\
0 & 0 & 0 & 0 & \cdots & -1 & 2 & -1
\end{pmatrix}$

since multiplication of D or any row of A by a positive constant does not change the problem. In this
case, multiplication of a column vector by A yields the negative of the second differences of the elements
of the vector.

The investigator can readily obtain D, A, b from his input-output
data. We seek a way to compute $\hat x$ and thereby $\hat\eta$. Iterative procedures
have generally been found most useful for this kind of computation.
One might, for example, adapt a gradient method of minimization⁶ to
this problem or one might choose an arbitrary x satisfying the
restrictions and proceed to minimize the form $xDx'$ with respect to each
component of x in turn holding the other components constant and
observing the relevant restrictions at each stage. It is argued in Section 4 that
the latter process would, in fact, converge to the minimizing vector.
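
As a small illustration of how A and b in (2.11) follow from the input levels and the level means, the sketch below builds the restriction matrix row by row. The helper names and the numbers are hypothetical; with equally spaced inputs the rows become proportional to (-1, 2, -1), as noted in footnote 5.

```python
def restriction_matrix(z):
    """Rows of A in (2.11)(ii): row n encodes restriction (2.5) for levels n, n+1, n+2."""
    N = len(z)
    A = []
    for n in range(N - 2):
        d1 = z[n + 1] - z[n]        # Delta_{n+1}
        d2 = z[n + 2] - z[n + 1]    # Delta_{n+2}
        row = [0.0] * N
        row[n] = -1.0 / d1
        row[n + 1] = 1.0 / d1 + 1.0 / d2
        row[n + 2] = -1.0 / d2
        A.append(row)
    return A

def matvec(A, y):
    """Product A y' for a list-of-rows matrix and a list vector."""
    return [sum(a * v for a, v in zip(row, y)) for row in A]

z = [0.0, 20.0, 40.0, 60.0]            # hypothetical, equally spaced input levels
ybar = [10.0, 30.0, 40.0, 45.0]        # hypothetical level means
A = restriction_matrix(z)
scaled = [[20.0 * a for a in row] for row in A]  # rows rescaled by the common spacing
b = matvec(A, ybar)                    # b' = A ybar', as in (2.11)(iii)
```

Since every element of `b` is non-negative for these means, they already satisfy (2.5), and x = 0 solves (2.11).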

However, for the particular problem stated in (2.11) it is possible to
develop a more economical approach by using the fact that problems
of extremization subject to inequalities have commonly been found to
be equivalent to saddle point problems similar to those encountered in
the Theory of Games. By a Theorem of Kuhn and Tucker [11, pp. 487,
491-2] the minimization problem stated in (2.11) is equivalent to the
following saddle point problem:

(2.12) Find vectors $\hat x$, $\hat v$ such that

$\phi(\hat x, v) \le \phi(\hat x, \hat v) \le \phi(x, \hat v)$

for all x and for all $v \ge 0$ where

(2.13) $\phi(x, v) = xDx' - v(Ax' + b').$

Some of the general methods that have been developed for minimax
problems could doubtless be adapted to this case. However, the
following procedure seems exceedingly simple and is used in the example
of Section 3.

Since D is positive definite, $\phi(x, v)$ has, for any v, a unique minimum
with respect to x which may be found by differentiation.

(2.14) $\dfrac{\partial\phi}{\partial x} = 2Dx' - A'v'.$

Setting the derivatives equal to zero yields

(2.15) $x' = \tfrac{1}{2}D^{-1}A'v'$

and substituting into (2.13),

(2.16) $\min_x \phi(x, v) = \phi^*(v) = -\tfrac{1}{4}vAD^{-1}A'v' - vb'.$
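
The substitution behind (2.16) can be written out in full. The lines below use only (2.13), (2.15) and the symmetry of D; the notation $x(v)$ for the minimizing x is introduced here for convenience.

```latex
\begin{aligned}
\phi\bigl(x(v), v\bigr)
  &= \tfrac{1}{4}\, vAD^{-1}\, D\, D^{-1}A'v'
     \;-\; v\bigl(\tfrac{1}{2}AD^{-1}A'v' + b'\bigr) \\
  &= \tfrac{1}{4}\, vAD^{-1}A'v' \;-\; \tfrac{1}{2}\, vAD^{-1}A'v' \;-\; vb' \\
  &= -\tfrac{1}{4}\, vAD^{-1}A'v' \;-\; vb' .
\end{aligned}
```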


6 See, for example, Chernoff and Divinsky [3, pp. 246-7].

To find the non-negative N-2 dimensional vector $\hat v$ that maximizes
this expression it is convenient to consider an equivalent minimum
problem. Let

(2.17) $C = AD^{-1}A'$ and

(2.18) $\theta(v) = -2\phi^*(v) = \tfrac{1}{2}vCv' + 2vb'.$

Clearly $\hat v$ minimizes $\theta(v)$. Since A has N-2 linearly independent rows,
it may be noted that C is positive definite. The procedure to be
followed in finding $\hat v$ is an iterative one. An initial value, say
$v^{(0)} = (v_1^{(0)}\ v_2^{(0)} \cdots v_{N-2}^{(0)})$, is chosen as a starting point for the iteration. Holding all
components of v except the first fixed at their values given by $v^{(0)}$, the
non-negative value of $v_1$ which minimizes $\theta(v)$ is found. Call this value
$v_1^{(1)}$. $\theta(v)$ is then minimized with respect to admissible values of the
second coordinate holding $v_1$ fixed at $v_1^{(1)}$ and $v_3$ to $v_{N-2}$ fixed at $v_3^{(0)}$ to $v_{N-2}^{(0)}$.
This process of minimizing with respect to each coordinate in turn
while holding others fixed at their last obtained values is continued
until the desired degree of stability⁷ is obtained.

Let $v_k$, $k = 1, 2, \dots, K$, be a given component of v where, of course,
$K = N-2$. The procedure indicated above can be made more explicit
by observing that the minimum of $\theta(v)$ with respect to any $v_k$ is either
attained where $v_k = 0$ or where $\partial\theta/\partial v_k = 0$. If the latter equation yields a
non-negative value for $v_k$, then this is the minimizing value, otherwise
$v_k = 0$ is the minimizing value. We note

(2.19) $\dfrac{\partial\theta}{\partial v} = Cv' + 2b'.$



At the pth stage of the iteration, we define $w_k^{(p)}$ as the value of the
kth coordinate that would be obtained by setting $\partial\theta/\partial v_k = 0$, i.e.,

(2.20) $w_k^{(p)} = -\sum_{i=1}^{k-1}\dfrac{c_{ki}}{c_{kk}}v_i^{(p)} - \sum_{i=k+1}^{K}\dfrac{c_{ki}}{c_{kk}}v_i^{(p-1)} - 2\dfrac{b_k}{c_{kk}}$

where the $c_{ki}$ are elements of C and $k = 1, 2, \dots, K$.
The value of the kth coordinate of v at the pth stage of the iteration
is then obtained by taking

(2.21) $v_k^{(p)} = \max\,(w_k^{(p)}, 0).$

For the production problem described earlier it is convenient to start
the process by setting $v^{(0)} = 0$. The process then generates a sequence of
vectors which we denote by $\{v^{(m)}\}$, i.e.,

$v^{(1)} = (v_1^{(1)}\ 0\ 0 \cdots 0)$

$v^{(2)} = (v_1^{(1)}\ v_2^{(1)}\ 0 \cdots 0)$

$\vdots$

$v^{(K)} = (v_1^{(1)}\ v_2^{(1)}\ v_3^{(1)} \cdots v_K^{(1)})$

$v^{(K+1)} = (v_1^{(2)}\ v_2^{(1)}\ v_3^{(1)} \cdots v_K^{(1)})$

$\vdots$

$v^{(pK+k)} = (v_1^{(p+1)} \cdots v_k^{(p+1)}\ v_{k+1}^{(p)} \cdots v_K^{(p)})$

etc.

The function to be minimized, $\theta(v)$, then defines a corresponding
sequence of scalars which we indicate by $\{\theta(v^{(m)})\}$. In Section 5 it is shown
that the sequence $\{\theta(v^{(m)})\}$ converges to a unique minimum. It will be
seen that the proof depends essentially on the existence of a unique
minimum, the boundedness of $\{v^{(m)}\}$, and the continuity of $\theta(v)$ and its
first derivatives. This proof is deferred because some readers may wish
to see the process illustrated and discussed before becoming involved in
the mathematical details of the proof. Once $\hat v$ has been obtained, $\hat x$ may
be found from (2.15) and the estimates of the ordinates, the $\hat\eta_n$, follow
from (2.9). The distribution of these estimates has not as yet been
investigated. In Section 3, the computing procedure just described is
applied to illustrative production data.

7 It is really a certain level of accuracy that is desired. In many iterative processes one intuitively
associates this with the observed degree of stability. It would be worthwhile however to investigate
circumstances under which the two may differ.
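
The whole procedure of this section, from the level means to the restricted estimates, can be sketched in a few lines. The code below is an illustrative reading of (2.11), (2.15), (2.17), (2.20) and (2.21), not the author's own program: the function name `concave_fit`, the fixed number of sweeps used in place of a stability check, and the toy data are all assumptions.

```python
def concave_fit(z, ybar, T, sweeps=500):
    """Restricted estimates of the ordinates by coordinate-wise iteration on v.

    z: input levels in increasing order; ybar: level means; T: trials per level.
    Returns (eta_hat, v_hat).
    """
    N = len(z)
    K = N - 2
    # A as in (2.11)(ii): each row encodes one concavity restriction (2.5).
    A = [[0.0] * N for _ in range(K)]
    for k in range(K):
        d1 = z[k + 1] - z[k]
        d2 = z[k + 2] - z[k + 1]
        A[k][k] = -1.0 / d1
        A[k][k + 1] = 1.0 / d1 + 1.0 / d2
        A[k][k + 2] = -1.0 / d2
    # b' = A ybar' and C = A D^{-1} A' with D = diag(T_1, ..., T_N).
    b = [sum(A[k][n] * ybar[n] for n in range(N)) for k in range(K)]
    C = [[sum(A[i][n] * A[j][n] / T[n] for n in range(N)) for j in range(K)]
         for i in range(K)]
    v = [0.0] * K  # start from v = 0, as suggested in the text
    for _ in range(sweeps):
        for k in range(K):
            # (2.20): root of d(theta)/d(v_k) = 0 given the latest other coordinates.
            s = sum(C[k][i] * v[i] for i in range(K) if i != k)
            w = -(s + 2.0 * b[k]) / C[k][k]
            v[k] = max(w, 0.0)  # (2.21)
    # (2.15): x' = (1/2) D^{-1} A' v', then eta = ybar + x from (2.9).
    x = [0.5 * sum(A[k][n] * v[k] for k in range(K)) / T[n] for n in range(N)]
    eta = [ybar[n] + x[n] for n in range(N)]
    return eta, v

# Toy example: a "valley" in the means is flattened to the common value 2/3.
eta, v = concave_fit([0.0, 1.0, 2.0], [1.0, 0.0, 1.0], [1, 1, 1])
```

A fixed sweep count keeps the sketch short; in practice one would stop when successive vectors v agree to the accuracy wanted, bearing in mind the caveat of footnote 7.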


3. AN ILLUSTRATIVE COMPUTATION 


The primary purpose of the illustration is to show how the computing 
procedure developed in the previous section can be applied. While data 
from actual experiments are used, I have not examined the original 
reports of these experiments and do not have any firm judgment about 
the appropriateness of combining data from these various experiments 
in the simple analysis proposed here. For this reason I do not try to 
discuss the economic implications of the data but merely use them to 
illustrate a computing procedure. 

The data are taken from corn fertility experiments conducted at 
North Carolina State College. These have been summarized in a bulle- 
tin by Krantz [10]. Corn yields that resulted from various applications 
of nitrogen fertilizer are available. Paul R. Johnson [8] has used the 
results for fitting production functions under several alternative as- 
sumptions about the algebraic form of the function. Prior to his 
analysis Johnson made an attempt to select experiments that would 
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provide observations of yields under fairly similar conditions in all re- 
spects except level of nitrogen applied. Only results from plots with 
closely related soil types and from years of “good” weather were used.⁸
Johnson’s data consisted partly of direct observations and partly of 
interpolated values. Only the direct observations are used in the present 
calculation. 


TABLE I
FITTED EQUATIONS AND OPTIMAL INPUTS FROM JOHNSON STUDY

Fitted Equation                              Nitrogen Application to Maximize Profit

$y = 4.504\,(z+20)^{.59}$                    5340
$y = 25.16 + .7595\,z - .00209\,z^2$         164
$y = 108 - 82.48\,(.9897)^z$                 230

y represents yield in bushels, z represents nitrogen in lbs.


8 Good weather is attributed by Johnson to years in which “the rainfall distribution was about
normal and soil moisture conditions were not low enough to cause the leaves to roll during this period
(a five-week critical period including the time of tasseling).” In principle one should take account of
the weather effect in the statistical specification if it is believed to have significantly affected the
observed yields. If this were done and the weather and input effects were assumed to be additive, (2.2)
would be replaced by

(2.2') $y_{nmt} = \eta_n + \delta_m + u_{nmt}$

where $m = 1, 2, \dots, M$ is an index of the year of a particular observation and $\delta_m$ is the weather effect
in that year. Let $T_{nm}$ be the number of observations at input level $z_n$ in year m. If the $T_{nm}$ are equal for
all n and m, then the weather effect causes no significant complication. If we impose the natural
requirement that $\sum_m \delta_m = 0$, we have

(i) $\hat\delta_m = \dfrac{\sum_n \sum_t y_{nmt}}{\sum_n T_{nm}} - \dfrac{\sum_n \sum_m \sum_t y_{nmt}}{\sum_n \sum_m T_{nm}}$

for $m = 1, 2, \dots, M$. The $\hat\eta_n$ can be obtained by minimizing

(ii) $S = \sum_n \left[\eta_n - \dfrac{\sum_m \sum_t y_{nmt}}{\sum_m T_{nm}}\right]^2$

subject to the restrictions in (2.5) and the procedure developed in Section 2 applies directly. If the $T_{nm}$
are unequal, as in the present case, the situation is more complicated. It seemed undesirable to
introduce these complications in the present illustration especially since an effort had previously been
made to select homogeneous years.

In addition to approximating the underlying functional relation,
Johnson was interested in the optimal application of nitrogen when
nitrogen costs $0.137 per lb. and corn sells for $1.75 per bushel. While
all of his equations fitted the data reasonably well, they differed
substantially in their implications for the most profitable level of nitrogen.
This is shown in Table I where the first column shows the relations
with estimated parameters filled in and the second column indicates
the corresponding level of nitrogen for highest profit.
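
The optima in Table I can be checked approximately from the fitted equations. The sketch below assumes that profit is maximized where the marginal product dy/dz equals the ratio of the nitrogen price to the corn price, and it reads the first equation as a power of (z + 20) with exponent .59; both are assumptions of this illustration. With the rounded coefficients as printed, the results come out at roughly 5,400, 163 and 231 lbs., against 5340, 164 and 230 in the table.

```python
import math

PRICE_CORN = 1.75       # dollars per bushel
PRICE_NITROGEN = 0.137  # dollars per lb.
ratio = PRICE_NITROGEN / PRICE_CORN  # condition used here: dy/dz = ratio

# y = 4.504 (z + 20)**0.59  ->  dy/dz = 4.504 * 0.59 * (z + 20)**(-0.41)
z_power = (4.504 * 0.59 / ratio) ** (1.0 / 0.41) - 20.0

# y = 25.16 + 0.7595 z - 0.00209 z**2  ->  dy/dz = 0.7595 - 2 * 0.00209 * z
z_quadratic = (0.7595 - ratio) / (2.0 * 0.00209)

# y = 108 - 82.48 * 0.9897**z  ->  dy/dz = -82.48 * ln(0.9897) * 0.9897**z
z_exponential = math.log(ratio / (-82.48 * math.log(0.9897))) / math.log(0.9897)
```

The spread between the three values, from the same data and prices, is the point of the comparison: the choice of algebraic form drives the recommendation.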

Unless one has considerable a priori confidence in the appropriate- 
ness of a particular algebraic form, there is considerable uncertainty 
about the implication of the data for the decision as to level of input. 
To illustrate the alternative procedure developed in Section 2, the ob- 
servations are first listed in Table II. 


TABLE II
OBSERVED YIELDS AT SPECIFIED LEVELS OF NITROGEN
(Nitrogen in lbs./acre, $z_n$)

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From these data, the following are readily computed 


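$D^{-1} = \operatorname{diag}\left(\dfrac{1}{T_1}, \dots, \dfrac{1}{T_N}\right) = \operatorname{diag}(.0370,\ .1111,\ .1250,\ .1000,\ .1111,\ .0526,\ .1000,\ .1250)$

$A = \begin{pmatrix}
-1 & 2 & -1 & 0 & 0 & 0 & 0 & 0 \\
0 & -1 & 2 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & -1 & 2 & -1 & 0 & 0 & 0 \\
0 & 0 & 0 & -2 & 3 & -1 & 0 & 0 \\
0 & 0 & 0 & 0 & -1 & 2 & -1 & 0 \\
0 & 0 & 0 & 0 & 0 & -1 & 3 & -2
\end{pmatrix}$
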
The meanings of the restrictions are not changed if any row of A is
multiplied by any positive constant. This sometimes makes it possible
to choose convenient numbers for elements. The A above differs from
that defined in (2.11) in that the above has been obtained by
multiplying the first three rows by 20 and the last three rows by 40.
Continuing

(3.3) $b' = A\bar y' = (-5.24,\ 30.53,\ -29.58,\ 45.45,\ -14.03,\ 19.60)'$
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C=AD-1A’ 

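$= \begin{pmatrix}
.6064 & -.4722 & .1250 & 0 & 0 & 0 \\
-.4722 & .7111 & -.4500 & .2000 & 0 & 0 \\
.1250 & -.4500 & .6361 & -.7333 & .1111 & 0 \\
0 & .2000 & -.7333 & 1.4525 & -.4385 & .0526 \\
0 & 0 & .1111 & -.4385 & .4215 & -.4052 \\
0 & 0 & 0 & .0526 & -.4052 & 1.4526
\end{pmatrix}$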








From these, formulas for the $w_k^{(p)}$ are readily obtained.

(3.5)
$w_1^{(p)} = .7787\,v_2^{(p-1)} - .2061\,v_3^{(p-1)} + 17.28$
$w_2^{(p)} = .6640\,v_1^{(p)} + .6328\,v_3^{(p-1)} - .2812\,v_4^{(p-1)} - 85.86$
$w_3^{(p)} = -.1965\,v_1^{(p)} + .7074\,v_2^{(p)} + 1.1527\,v_4^{(p-1)} - .1746\,v_5^{(p-1)} + 93.00$
$w_4^{(p)} = -.1377\,v_2^{(p)} + .5048\,v_3^{(p)} + .3019\,v_5^{(p-1)} - .0362\,v_6^{(p-1)} - 62.58$
$w_5^{(p)} = -.2636\,v_3^{(p)} + 1.0403\,v_4^{(p)} + .9613\,v_6^{(p-1)} + 66.57$
$w_6^{(p)} = -.0362\,v_4^{(p)} + .2789\,v_5^{(p)} - 26.99.$

From (3.5) the values of $v_k^{(p)}$ shown in Table III resulted when $v^{(0)}$ was
taken equal to the zero vector. It is recalled that

(2.21) $v_k^{(p)} = \max\,(w_k^{(p)}, 0).$


TABLE III
SUCCESSIVE VALUES OF $v^{(m)}$

Applying (2.15) we have

$$\hat{x}' = (0,\ 0,\ -5.33,\ 8.53,\ -7.19,\ 2.32,\ -2.20,\ 0)'$$


and recalling (2.9), the maximum likelihood estimates of the ordinates are given by

$$\hat{\eta}' = \bar{y}' + \tfrac{1}{2}D^{-1}A'\hat{v}' = (22.94,\ 41.58,\ 60.13,\ 67.34,\ 74.55,\ 84.47,\ 94.39,\ 94.01)'.$$

These are shown together with the original observations and their means in Figure 1.
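
The arithmetic of this example can be retraced with a short program. The following is a minimal sketch, not the authors' own computing routine. It assumes that $\theta(v) = \tfrac{1}{2}vCv' + 2vb'$, so that each trial value is $w_k = -(\sum_{i \ne k} c_{ki}v_i + 2b_k)/c_{kk}$, which is the form the numerical coefficients in (3.5) imply. The names `iterate` and `adjustments` are illustrative, and the last four diagonal entries of `D_INV` are inferred from the printed C rather than read from the page.

```python
# C and b as printed in the worked example.
C = [
    [0.6064, -0.4722, 0.1250, 0.0, 0.0, 0.0],
    [-0.4722, 0.7111, -0.4500, 0.2000, 0.0, 0.0],
    [0.1250, -0.4500, 0.6361, -0.7333, 0.1111, 0.0],
    [0.0, 0.2000, -0.7333, 1.4525, -0.4385, 0.0526],
    [0.0, 0.0, 0.1111, -0.4385, 0.4215, -0.4052],
    [0.0, 0.0, 0.0, 0.0526, -0.4052, 1.4526],
]
b = [-5.24, 30.53, -29.58, 45.45, -14.03, 19.60]

# A with rows already rescaled by 20 and 40, and the diagonal of D^{-1}
# (entries 5 to 8 are an assumption, inferred from C = A D^{-1} A').
A = [
    [-1, 2, -1, 0, 0, 0, 0, 0],
    [0, -1, 2, -1, 0, 0, 0, 0],
    [0, 0, -1, 2, -1, 0, 0, 0],
    [0, 0, 0, -2, 3, -1, 0, 0],
    [0, 0, 0, 0, -1, 2, -1, 0],
    [0, 0, 0, 0, 0, -1, 3, -2],
]
D_INV = [0.0370, 0.1111, 0.1250, 0.1000, 0.1111, 0.0526, 0.1000, 0.1250]


def iterate(C, b, tol=1e-12, max_sweeps=10000):
    """Adjust one element of v at a time, keeping it non-negative."""
    K = len(b)
    v = [0.0] * K  # start from the zero vector, as in the text
    for _ in range(max_sweeps):
        largest_change = 0.0
        for k in range(K):
            off_diag = sum(C[k][i] * v[i] for i in range(K) if i != k)
            w = -(off_diag + 2.0 * b[k]) / C[k][k]
            new_v = max(w, 0.0)  # rule (2.21)
            largest_change = max(largest_change, abs(new_v - v[k]))
            v[k] = new_v
        if largest_change < tol:
            break
    return v


def adjustments(v, A, d_inv):
    """x' = (1/2) D^{-1} A' v'."""
    return [
        0.5 * d_inv[j] * sum(A[k][j] * v[k] for k in range(len(v)))
        for j in range(len(d_inv))
    ]


v_hat = iterate(C, b)
x_hat = adjustments(v_hat, A, D_INV)
print([round(t, 2) for t in v_hat])
print([round(t, 2) for t in x_hat])
```

Under these assumptions the adjustments agree, to within rounding, with the values $-5.33$, $8.53$, $-7.19$, $2.32$, $-2.20$ printed after the reference to (2.15).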

While a comparison of profitability of various applications is subject to the qualifications mentioned at the beginning of the section, it may be worthwhile to note that, at prices of $1.75 for corn and .137 for nitrogen, 160 lbs. would be better than the other levels according to our estimates. To get a good determination of optimal input for about this price ratio there should clearly be more observations in the 120-220 lbs. interval, they should be more closely spaced, and weather effects should be taken into account (see fn. 9). Since economists are usually interested in the optimal production practices corresponding to a number of possible price situations, a useful interpretation of the results could be obtained by determining for each observed level of input those price combinations at which a particular level of input is most profitable. Such a treatment has been illustrated for hypothetical production data by Hildreth and Reiter [6]. One consequence of estimating points on a surface, instead of estimating parameters in an equation assumed to represent the surface, is that interpolation or extrapolation to parts of the surface for which no observations are available depends directly on judgment rather than on an initial assumption about the algebraic form. In many contexts this should probably be counted an advantage of form-free estimation. Experienced persons who may have useful notions of the approximate behavior of a surface may have difficulty visualizing all of the implications of a particular choice of algebraic form. In addition, the form-free procedure allows necessary interpolations or extrapolations to be made at the stage of applying the results. At this stage various adjustments for possible discrepancies between experimental and commercial conditions need to be considered and advice of persons familiar with the circumstances of a particular application is likely to be available.


FIG. 1. Observations and Estimates. (Horizontal axis: lbs. of nitrogen.)

- Observed yield ($y_{nt}$)
× Mean yield ($\bar{y}_n$)
○ Maximum likelihood estimate of expected yield ($\hat{\eta}_n$)


4. PRODUCTION FUNCTIONS WITH TWO VARIABLE INPUTS


Procedures similar to the one just illustrated could be applied to other problems of estimating an unknown coordinate of points on a surface about which the investigator has qualitative a priori information, provided the a priori information could be translated into a set of linear inequalities (equalities would produce no complication and may be regarded as a special case) restricting the values of the unknown coordinate. It is likely that complications will sometimes arise in the translation of qualitative information into restrictions on the likelihood function. It may also be expected that different computing techniques will be required in some cases.

The proof of convergence of the iterative process used in Section 3 is seen in Section 5 to depend essentially on the existence of a unique minimum, the boundedness of $\{v^{(m)}\}$ and the continuity of $\theta(v)$ and its first derivatives. These conditions would all be met if the iteration were carried out on the vector x of the original problem (2.11) instead of the vector v of the dual problem. We could choose a vector $x^{(0)} = (x_1^{(0)}, x_2^{(0)}, \ldots, x_N^{(0)})$ to start the iteration, find an element $x_1^{(1)}$ which minimizes $xDx'$ subject to $Ax' + b' \ge 0$ with $x_2, \ldots, x_N$ held fixed at their initial values, then minimize $xDx'$ with respect to $x_2$, etc. This would have been more cumbersome (three $x_n$ would appear in each restriction and, except at the ends, three restrictions would have to be examined each time an $x_n$ were altered) than working with the dual problem in the case considered. However, in some problems it may be useful to consider an iteration on the original variables that enter the likelihood function.

For example, consider the case of a production function with two variable inputs, say

$$y_{mnt} = \eta(s_m, z_n) + u_{mnt} \tag{4.1}$$

where

$s_m$: the $m$th level of one input

$z_n$: the $n$th level of the other input

$y_{mnt}$: observed output of the $t$th observation with input levels $s_m$ and $z_n$

$u_{mnt}$: the value taken by the random disturbance on the $t$th trial with input levels $s_m$ and $z_n$

and

$$m = 1, 2, \ldots, M; \quad n = 1, 2, \ldots, N; \quad t = 1, 2, \ldots, T_{mn}.$$

To simplify the discussion we assume that the $T_{mn}$ are all equal (the case of unequal numbers of trials could be covered by substituting $T_{mn}x_{mn}^2$ for $x_{mn}^2$ in (4.9) below and modifying subsequent equations accordingly). Let

$$\eta_{mn} = \eta(s_m, z_n) \tag{4.2}$$

and

$$x_{mn} = \eta_{mn} - \bar{y}_{mn} \tag{4.3}$$

where

$$\bar{y}_{mn} = \frac{1}{T_{mn}} \sum_{t=1}^{T_{mn}} y_{mnt}. \tag{4.4}$$

If we assume diminishing returns to each input, then the maximum likelihood estimates of the $\eta_{mn}$ are obtained by minimizing the sum of squares of the $x_{mn}$ subject to the restrictions imposed by diminishing returns. Let X denote the M rowed, N columned matrix with typical element $x_{mn}$ and let Y be the $M \times N$ matrix with typical element $\bar{y}_{mn}$. The restrictions expressing diminishing returns to the first input are given by

$$A(X + Y) \ge 0 \tag{4.5}$$

where A is of order $(M-2) \times M$ and is analogous to the A of Section 2; its elements are given by

$$\begin{aligned}
a_{ij} &= 0 && \text{for } i > j \text{ and for } i < j - 2 \\
a_{ij} &= -\frac{1}{s_{i+1} - s_i} && \text{for } i = j \\
a_{ij} &= \frac{1}{s_{i+1} - s_i} + \frac{1}{s_{i+2} - s_{i+1}} && \text{for } i = j - 1 \\
a_{ij} &= -\frac{1}{s_{i+2} - s_{i+1}} && \text{for } i = j - 2.
\end{aligned} \tag{4.6}$$
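
The element rules for A can be written out directly. The sketch below is illustrative only: the function names and the input levels are hypothetical (the levels are chosen so that the spacings 20, 20, 20, 20, 40, 40, 20 match the row scalings used in the single-input example of Section 3).

```python
def restriction_matrix(levels):
    """Build the (M-2) x M matrix with a_ii = -1/h1,
    a_{i,i+1} = 1/h1 + 1/h2 and a_{i,i+2} = -1/h2,
    where h1, h2 are the two successive spacings of the input levels."""
    rows = []
    for i in range(len(levels) - 2):
        h1 = levels[i + 1] - levels[i]
        h2 = levels[i + 2] - levels[i + 1]
        row = [0.0] * len(levels)
        row[i] = -1.0 / h1
        row[i + 1] = 1.0 / h1 + 1.0 / h2
        row[i + 2] = -1.0 / h2
        rows.append(row)
    return rows


def satisfies_restrictions(A, ordinates, tol=1e-12):
    """True when every row of A applied to the ordinates is non-negative."""
    return all(
        sum(a * y for a, y in zip(row, ordinates)) >= -tol for row in A
    )


levels = [0, 20, 40, 60, 80, 120, 160, 180]  # hypothetical input levels
A = restriction_matrix(levels)
concave = [s ** 0.5 for s in levels]  # diminishing returns
convex = [s ** 2 for s in levels]     # increasing returns

print([round(40 * a, 6) for a in A[3]])
print(satisfies_restrictions(A, concave), satisfies_restrictions(A, convex))
```

Each row is the difference between two successive slopes, so a non-negative product says the slope does not increase from one interval to the next. Multiplying a row by a positive constant leaves the restriction unchanged.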





Similarly, the assumption of diminishing returns to the second input leads to the restrictions

$$B(X' + Y') \ge 0 \tag{4.7}$$

where B is of order $(N-2) \times N$ with elements

$$\begin{aligned}
b_{ij} &= 0 && \text{for } i > j \text{ and for } i < j - 2 \\
b_{ij} &= -\frac{1}{z_{i+1} - z_i} && \text{for } i = j \\
b_{ij} &= \frac{1}{z_{i+1} - z_i} + \frac{1}{z_{i+2} - z_{i+1}} && \text{for } i = j - 1 \\
b_{ij} &= -\frac{1}{z_{i+2} - z_{i+1}} && \text{for } i = j - 2.
\end{aligned} \tag{4.8}$$

The problem is to find a matrix X such that the sum of squares of its elements is a minimum subject to (4.5) and (4.7). In the notation already introduced we wish to minimize $\psi(X)$ subject to the diminishing returns restrictions, letting

$$\psi(X) = \sum_{m=1}^{M} \sum_{n=1}^{N} x_{mn}^2 = \operatorname{tr} XX' \tag{4.9}$$

where tr is an abbreviation for trace.

This could be done by iterating on the elements of X subject to the 
restrictions (4.5) and (4.7). 

Alternatively one could consider the dual problem of finding the minimax of

$$\psi(X, V, W) = \operatorname{tr} XX' - \operatorname{tr} VA(X + Y) - \operatorname{tr} WB(X' + Y') \tag{4.10}$$

where V is an $N \times (M-2)$ matrix of coefficients to be determined and W is a similar matrix of order $M \times (N-2)$. We seek the minimum with respect to X and the maximum with respect to V and W subject to $V \ge 0$, $W \ge 0$.

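As in the single input case, the minimizing values for X can be found by differentiation.

$$\frac{\partial \psi}{\partial X} = 2X' - VA - B'W' = 0 \tag{4.11}$$

$$X = \tfrac{1}{2}(A'V' + WB). \tag{4.12}$$

Substituting this into (4.10) yields

$$\psi^*(V, W) = -\tfrac{1}{4}\operatorname{tr} VAA'V' - \tfrac{1}{2}\operatorname{tr} VAWB - \tfrac{1}{4}\operatorname{tr} WBB'W' - \operatorname{tr} VAY - \operatorname{tr} WBY'. \tag{4.13}$$

To complete the analogy with the single input case we define

$$\theta(V, W) = -2\psi^* = \tfrac{1}{2}\operatorname{tr}(VAA'V' + 2VAWB + B'W'WB) + 2\operatorname{tr} VAY + 2\operatorname{tr} WBY' \tag{4.14}$$
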
which is to be minimized subject to the nonnegativity of the elements of V and W. One could proceed to iterate for the minimizing elements of W and V. However, if N and M are very large this will be a long process and might be no easier than to compute the original problem of minimizing $\psi(X)$. It should also be noted that the proof of convergence in Section 5 does not apply to this case because the quadratic part of (4.13) is positive semi definite rather than positive definite. While it seems a reasonable conjecture that the computation would converge, this problem needs further investigation. In any problem in which the number of linear inequalities exceeds the number of variables in the likelihood function, the quadratic part of the expression to be minimaxed in the dual problem will be positive semi definite and the convergence of the suggested iteration will have to be shown.

These and other complications make it hard to foresee which kinds 
of computing arrangements are likely to be most generally useful. As 
experience indicates more exactly the kinds of relations and restrictions 
to which applied workers want to apply methods like those developed 
here, it will be useful to give more attention to this problem. 


5. PROOF OF CONVERGENCE 


In this section we wish to show that the procedure suggested in Section 2 leads in the limit to a unique minimum of the function $\theta(v)$ for v in the closed positive orthant and therefore to estimates of ordinates which maximize the likelihood function in its restricted domain. We recall that $\{v^{(m)}\}$ is a sequence of vectors in the positive orthant of a K dimensional Euclidean space, and that $v^{(m+1)}$ is obtained from $v^{(m)}$ by adjusting one element of v so as to minimize $\theta(v)$ subject to the conditions that the adjusted element remain non-negative and that other elements retain the values assumed in $v^{(m)}$.

Recalling also that

$$\theta(v) = \tfrac{1}{2}vCv' + 2vb' \tag{2.18}$$

where C is positive definite, we note that the sequence $\{\theta(v^{(m)})\}$ is non-increasing and bounded below, therefore it converges.

In addition, it can be argued that $\theta(v)$ has a unique minimum for v in the (closed) positive orthant. In the first place any minimizing point must lie in the intersection of the positive orthant and the ellipsoid $\theta(v) \le \theta(v^{(m)})$. Since this intersection is closed and bounded, a minimum is attained there. Suppose there were two minimizing points, say $v^*$ and $v^{**}$. The line segment joining them would lie in the positive orthant and would also lie in the ellipsoid $\theta(v) \le \theta(v^*) = \theta(v^{**})$. Points in the interior of this ellipsoid correspond to lower values of $\theta(v)$ than points on the surface, i.e. $\theta(v) < \theta(v^*)$ for v in the interior of $\{v : \theta(v) \le \theta(v^*)\}$. Unless $v^* = v^{**}$, the line segment joining them contains interior points and the supposition that $v^*$, $v^{**}$ were minimizing points is contradicted.





® The main features of this proof were suggested by Roy Radner. 
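We should next like to show that

$$\lim_{p \to \infty} \left| v_k^{(p)} - v_k^{(p+1)} \right| = 0 \quad \text{for all } k. \tag{5.1}$$

Let $m = pK + k$. Then in passing from $v^{(m-1)}$ to $v^{(m)}$, only the kth coordinate of v changes. Consider the following:

$$\theta(v^{(m-1)}) - \theta(v^{(m)}) = \tfrac{1}{2}c_{kk}\left[(v_k^{(p)})^2 - (v_k^{(p+1)})^2\right] + (v_k^{(p)} - v_k^{(p+1)})\left[\sum_{i=1}^{k-1} c_{ki}v_i^{(p+1)} + \sum_{i=k+1}^{K} c_{ki}v_i^{(p)} + 2b_k\right]. \tag{5.2}$$

Changing the superscript in (2.20) yields

$$w_k^{(p+1)} = -\sum_{i=1}^{k-1} \frac{c_{ki}}{c_{kk}} v_i^{(p+1)} - \sum_{i=k+1}^{K} \frac{c_{ki}}{c_{kk}} v_i^{(p)} - \frac{2b_k}{c_{kk}}. \tag{2.20'}$$

Substituting into (5.2) we obtain

$$\theta(v^{(m-1)}) - \theta(v^{(m)}) = \frac{c_{kk}}{2}(v_k^{(p)} - v_k^{(p+1)})(v_k^{(p)} + v_k^{(p+1)} - 2w_k^{(p+1)}) \ge \frac{c_{kk}}{2}(v_k^{(p)} - v_k^{(p+1)})^2. \tag{5.3}$$

To justify the mixed inequality we note that if $w_k^{(p+1)} \ge 0$ then $v_k^{(p+1)} = w_k^{(p+1)}$ and the equality holds. If $w_k^{(p+1)} < 0$, then $v_k^{(p+1)} = 0$ and $(v_k^{(p)} - 2w_k^{(p+1)}) > v_k^{(p)} \ge 0$ and the mixed inequality holds. (5.1) follows from (5.3) and the convergence of $\{\theta(v^{(m)})\}$.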
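
Inequality (5.3) can be checked numerically on the matrices of the worked example in Section 3. The sketch below is only a spot check under the assumption that $\theta(v) = \tfrac{1}{2}vCv' + 2vb'$; the helper names `theta` and `coordinate_steps` are illustrative. For every single-coordinate adjustment it records the actual decrease in $\theta$ and the lower bound $\tfrac{1}{2}c_{kk}(v_k^{(p)} - v_k^{(p+1)})^2$.

```python
# C and b as printed in the worked example of Section 3.
C = [
    [0.6064, -0.4722, 0.1250, 0.0, 0.0, 0.0],
    [-0.4722, 0.7111, -0.4500, 0.2000, 0.0, 0.0],
    [0.1250, -0.4500, 0.6361, -0.7333, 0.1111, 0.0],
    [0.0, 0.2000, -0.7333, 1.4525, -0.4385, 0.0526],
    [0.0, 0.0, 0.1111, -0.4385, 0.4215, -0.4052],
    [0.0, 0.0, 0.0, 0.0526, -0.4052, 1.4526],
]
b = [-5.24, 30.53, -29.58, 45.45, -14.03, 19.60]


def theta(v):
    """theta(v) = (1/2) v C v' + 2 v b' (assumed form of (2.18))."""
    K = len(v)
    quad = sum(v[i] * C[i][j] * v[j] for i in range(K) for j in range(K))
    lin = sum(v[i] * b[i] for i in range(K))
    return 0.5 * quad + 2.0 * lin


def coordinate_steps(sweeps=15):
    """Return (decrease in theta, lower bound from (5.3)) for each step."""
    K = len(b)
    v = [0.0] * K
    steps = []
    for _ in range(sweeps):
        for k in range(K):
            before, old = theta(v), v[k]
            off_diag = sum(C[k][i] * v[i] for i in range(K) if i != k)
            w = -(off_diag + 2.0 * b[k]) / C[k][k]
            v[k] = max(w, 0.0)
            bound = 0.5 * C[k][k] * (old - v[k]) ** 2
            steps.append((before - theta(v), bound))
    return steps


steps = coordinate_steps()
print(all(drop >= bound - 1e-8 for drop, bound in steps))
```

Since every decrease is at least the bound, and the decreases must tend to zero because $\{\theta(v^{(m)})\}$ converges, the successive changes in each coordinate must also tend to zero, which is the content of (5.1).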

Let $P_k(v)$ be the vector obtained from v by holding all but the kth component fixed and minimizing $\theta(v)$ with respect to $v_k$. $P_k$ is a continuous mapping of K dimensional Euclidean space into itself. In our sequence $\{v^{(m)}\}$ we have

$$v^{(m)} = P_{m \sim K}(v^{(m-1)}) \quad \text{where } m \sim K \text{ means } m \text{ modulo } K. \tag{5.4}$$

Let $\theta_{\min} = \theta(v_{\min})$ be the value of our function at its minimum for non-negative v. Let $\theta_\infty$ be the limit of our sequence $\{\theta(v^{(m)})\}$. We should like to show that $\theta_\infty = \theta_{\min}$. The vector sequence $\{v^{(m)}\}$ is bounded. In particular the ellipsoid given by $\theta(v) = \theta(v^{(0)})$ contains the ellipsoids given by $\theta(v) = \theta(v^{(m)})$ and thus bounds the sequence. $\{v^{(m)}\}$ therefore has at least one limit point and contains a subsequence which converges to this limit. Let $v^0$ be a limit of $\{v^{(m)}\}$ and let $\{v^{(r)}\}$ be a subsequence approaching $v^0$. For each r identifying an element of the subsequence, let m(r) identify the same element in the original sequence.

From the continuity of $\theta$,

$$\theta_\infty = \theta(v^0). \tag{5.5}$$

We shall show that to suppose $\theta(v^0) \ne \theta_{\min}$ involves a contradiction. If the supposition is true then it is possible to reduce $\theta(v)$ by changing a coordinate of $v^0$, i.e.

$$\exists\, k \ni P_k(v^0) \ne v^0. \tag{5.6}$$

Let $\mathcal{K}$ be the set of all such k and let

$$\epsilon = \min_{k \in \mathcal{K}} \left[\theta(v^0) - \theta(P_k(v^0))\right]. \tag{5.7}$$

Let V be the set of all v in the convex set bounded by the ellipsoid $\theta(v) = \theta(v^{(0)})$. $\theta(v)$ is uniformly continuous over V, i.e.,

$$\exists\, \delta > 0 \ni \|v - v^*\| < \delta \Rightarrow |\theta(v) - \theta(v^*)| < \epsilon \tag{5.8}$$

for all $v, v^* \in V$.

We proceed to show that our original vector sequence, $\{v^{(m)}\}$, contains an element $P_k(v^{(m)})$ within $\delta$ of $P_k(v^0)$ for a $k \in \mathcal{K}$. It will then follow from (5.8) that $|\theta(P_k(v^{(m)})) - \theta(P_k(v^0))| < \epsilon$. Since $\theta(P_k(v^0))$ is at least $\epsilon$ below $\theta_\infty$, $\theta(P_k(v^{(m)}))$ must also be less than $\theta_\infty$. But since $\theta_\infty$ is the limit of $\theta(v^{(m)})$, this is in contradiction to the definition of $\{v^{(m)}\}$.

From the continuity of the $P_k$ we know

$$\exists\, \rho > 0 \ni \|v - v^0\| < \rho \Rightarrow \|P_k(v) - P_k(v^0)\| < \delta \tag{5.9}$$

for all k.

From (5.1) we know that successive elements of $\{v^{(m)}\}$ can be made arbitrarily close together by making m sufficiently large. We also know that if r is sufficiently large, elements of the subsequence $\{v^{(r)}\}$ lie arbitrarily close to $v^0$. Specifically we may say

$$\exists\, M \ni m > M \Rightarrow \|v^{(m+1)} - v^{(m)}\| < \frac{\rho}{2K}; \qquad \exists\, R \ni r > R \Rightarrow \|v^{(r)} - v^0\| < \frac{\rho}{2}. \tag{5.10}$$

Now consider an $\bar{r}$ such that $\bar{r} > R$, $\bar{m} = m(\bar{r}) > M$. The K elements of $\{v^{(m)}\}$ immediately following $v^{(\bar{m})}$ are all within $\rho$ of $v^0$. At least one of them is obtained from the preceding by applying $P_k$ with $k \in \mathcal{K}$. Such an element, $P_k(v^{(\bar{m}+i)})$ with i an integer between 0 and $K-1$, lies within $\delta$ of $P_k(v^0)$ and reduces $\theta(v)$ below $\theta_\infty$, i.e.,

$$\|v^{(\bar{m}+i)} - v^0\| < \rho \quad \text{for some } 0 \le i \le K-1 \tag{5.11}$$

such that $v^{(\bar{m}+i+1)} = P_k(v^{(\bar{m}+i)})$ for $k \in \mathcal{K}$. From this and (5.9) we obtain

$$\|P_k(v^{(\bar{m}+i)}) - P_k(v^0)\| < \delta \tag{5.12}$$

so that by (5.8)

$$|\theta(P_k(v^{(\bar{m}+i)})) - \theta(P_k(v^0))| < \epsilon. \tag{5.13}$$

Then from (5.7)

$$\theta(P_k(v^{(\bar{m}+i)})) < \theta_\infty. \tag{5.14}$$
APPROXIMATE DISTRIBUTION OF THE RANGE IN THE
NEIGHBORHOOD OF LOW PERCENTAGE POINTS*


MAURICE H. BELZ,† University of Melbourne
AND
ROBERT HOOKE, Princeton University


I. GENERAL 
1. Introduction 


THE investigations described below originated in a study of the upper-tail probabilities associated with the distribution of the range of a small sample from a normal population. If the cumulative distribution function for such a range is plotted on probability paper, as in Hald [3], it is observed that the curves (for different sample sizes) tend to become closely linear as the range increases. The same feature appears to characterize the distribution of the range for small samples from other continuous populations, for example, the $\chi^2$ distributions for 2 and 4 degrees of freedom and the double negative exponential distribution; see Fig. 1. Attempts to discover the reason for this property have led to some general approximation procedures for estimating upper-tail probabilities for the distribution of the range of small samples.

Fig. 1. Distribution of range of samples of 5 from various populations.

* Prepared in connection with research sponsored, in part, by the U. S. Office of Naval Research.
† Research Associate, Princeton University, 1952.

If $x_1 < x_2 < \cdots < x_n$ represent the ordered members of a sample of size n, the range is $x_n - x_1$. If this statistic is such that the probability of its being exceeded is low, say of the order of .05 or .01, the effect of $x_1$ on $x_n$ might well be supposed to be small, whatever the value of n. This suggests, first, the investigation of an approximation based on the notion of complete independence of $x_1$ and $x_n$. Secondly, one may attach to this the assumption that $x_1$ and $x_n$ are normally distributed, a result that is known to be approximately true, for small values of n, when the parent population is normal [6]. Results of the application of these procedures to small samples from various populations are reported below.


2. Notation and Summary of Results
The following notation is employed throughout. For a>0,


$P_1(a)$ = probability that the range exceeds a;

$P_2(a)$ = approximation to the probability of the same event, based on the supposition that $x_1$ and $x_n$ are independent, with means and variances characteristic of their actual distributions;

$P_3(a)$ = approximation to the probability of the same event, based now on the supposition that $x_1$ and $x_n$ are independently and normally distributed, with means and variances characteristic of their actual distributions;

$R(a) = P_1(a)/P_2(a)$;

$R = \lim_{P_1(a) \to 0} R(a)$.


It is shown that, in general, R is not equal to 1, the value that might intuitively have been expected, so that the indicated approximations to $P_1(a)$ are $RP_2(a)$ and $RP_3(a)$. For any distribution defined over a finite interval, $R = (n-1)/n$. For distributions having just one infinite tail, R lies between $(n-1)/n$ and 1 and can, in fact, be as large as 1 or as small as $(n-1)/n$. These results appear also to be true in the case of distributions with two infinite tails, though no proofs have yet been obtained.

Numerical applications of the approximation procedures have been made to particular distributions, mostly with n=5 and with emphasis on those values of a which place $P_1(a)$ between .05 and .01, the interval of most interest in practice. The results are summarized in Table I.
TABLE I

TAIL PROBABILITIES FOR THE DISTRIBUTION OF THE RANGE
(n=5 unless otherwise noted)

[The column layout of this table (values of a, the probabilities and their approximations, and R(a)) is not recoverable. The parent distributions tabulated are listed below, with densities where they remain legible.]

Rectangular: $f(x) = 1$ for $0 \le x \le 1$
Symmetrical Triangular: $f(x) = x$ for $0 \le x \le 1$; $f(x) = 2 - x$ for $1 \le x \le 2$
Beta, parameters 2, 2: for $0 \le x \le 1$
Chi-square, 2 d.f.: for $0 \le x < \infty$
Chi-square, 4 d.f.: $f(x) = xe^{-x}$ for $0 \le x < \infty$
Chi-square, 2m+2 d.f. (n unrestricted): for $0 \le x < \infty$; limit $R = (n-1)/n$ when $m \to \infty$
Hyperbolic: $f(x) = 1/(1+x)^2$ for $0 \le x < \infty$ (n=3)
Hyperbolic: $f(x) = 2/(1+x)^3$ for $0 \le x < \infty$ (n=3)
Hyperbolic: $f(x) = 3/(1+x)^4$ for $0 \le x < \infty$ (n=3)
Hyperbolic, $m \ge 2$: for $0 \le x < \infty$ (n unrestricted)
Double exponential: $f(x) = \tfrac{1}{2}e^{-|x|}$ for $-\infty < x < \infty$
Double Hyperbolic, $m \ge 2$: for $-\infty < x < \infty$ (n unrestricted)
Standard Normal: $f(x) = \frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}x^2}$ (n=5)
Poisson (mean not legible)

[Entries still legible, without their row and column positions: .0746, .0293, .993, .996, .0580, .0440, .0294, .0121, .0543, .0273, .0111, .0542, .0166.]

* Question marks indicate that, although R is taken to be 0.8, this has not been established in this case.
3. Conclusions


Table I shows that $RP_3(a)$ provides a fair to good approximation to $P_1(a)$ if $P_1(a)$ is near .05 or if the parent distribution is normal. When the parent population is unknown it therefore seems reasonable to suppose that $RP_3(a)$ will be a good approximation to $P_1(a)$ provided that the distribution of the parent is not too far from normal. Even when this condition is not fulfilled, the variety of examples in Table I indicates that the approximation is likely to be good if $P_1(a)$ is near to .05.

The results of examples involving two-tailed distributions are consistent with the suggestion that R lies between $(n-1)/n$ and 1. In all the cases examined the ratio R(a) was found to lie between this same pair of values. Failure to know the exact value of R is not very serious, for even when n is as small as 5 the range of R is only from .8 to 1, and whatever the value one takes for R in this range no error of any consequence is likely to be committed.

It is shown in Section 7 that when the parent population is unknown and cannot be assumed to be normal, the use of the approximation $RP_3(a)$ is likely to be considerably better than that obtained by the direct method of examining observed ranges, when the amount of data is small. This is supported by some results obtained from drawings from various populations.

The closeness of the approximations obtained for distributions defined over a finite interval as well as for the normal distribution suggests that the procedure might be usefully extended to contrasts other than the range. Tables of percentage points to cover quite simple contrasts are not now available, even for a normal parent, and some approximation procedure must accordingly be adopted. An investigation of the contrast $x_n - \tfrac{1}{2}(x_1 + x_2)$ has been carried out along the above lines for samples of 5 from four different populations, and in Section 8 it is shown that the approximation $RP_3(a)$ is remarkably close to the true probability $P_1(a)$ when this is of the order .05 to .01. The analytical results are again supported by means of random drawings.


II. THE SUPPOSITION OF INDEPENDENCE: THEORETICAL RESULTS
4. Expressions for $P_1(a)$, $P_2(a)$, $R(a)$


Let f(x), F(x) be the density function and distribution function, respectively, and let $(\alpha, \beta)$ be the interval over which x is defined. Then we have the following results where, for simplicity, u is written in place of $x_1$ and v in place of $x_n$.

$$P_1(a) = \Pr(v - u > a) = \iint_\omega n(n-1)f(u)\{F(v) - F(u)\}^{n-2}f(v)\,du\,dv, \quad (a > 0) \tag{1}$$

$$P_2(a) = \iint_\omega n^2 f(u)\{1 - F(u)\}^{n-1}F^{n-1}(v)f(v)\,du\,dv, \tag{2}$$

$$R(a) = \frac{P_1(a)}{P_2(a)} = \frac{n-1}{n} \cdot \frac{\iint_\omega \{F(v) - F(u)\}^{n-2}f(u)f(v)\,du\,dv}{\iint_\omega \{1 - F(u)\}^{n-1}F^{n-1}(v)f(u)f(v)\,du\,dv}, \tag{3}$$

where $\omega$ is the relevant portion of the half-plane for which $v > u + a$.

In evaluating the integrals it is convenient to integrate first with respect to v, a distribution-free procedure, the terminals for the integral being $u + a$ and $\beta$. The integration with respect to u is then from $\alpha$ to $\beta - a$. We find the following expressions,

$$P_1(a) = 1 - \{1 - F(\beta - a)\}^n - n\int_\alpha^{\beta - a} f(u)\{F(u + a) - F(u)\}^{n-1}\,du, \tag{4}$$

$$P_2(a) = 1 - \{1 - F(\beta - a)\}^n - n\int_\alpha^{\beta - a} f(u)\{1 - F(u)\}^{n-1}F^n(u + a)\,du, \tag{5}$$

with the aid of which most of the numerical values in Table I have been computed.
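
Expressions (4) and (5) lend themselves to direct numerical evaluation. The following sketch is an illustration only, not the computing method used for Table I: it applies Simpson's rule (the helper `simpson` and the function names `p1`, `p2`, `ratio` are introduced here for the purpose) to the rectangular distribution on (0, 1) with n = 5, for which $P_1(a) = 1 - na^{n-1} + (n-1)a^n$ is available in closed form as a check.

```python
def simpson(g, lo, hi, panels=2000):
    """Composite Simpson rule on [lo, hi]; panels must be even."""
    if hi <= lo:
        return 0.0
    h = (hi - lo) / panels
    total = g(lo) + g(hi)
    for j in range(1, panels):
        total += (4 if j % 2 else 2) * g(lo + j * h)
    return total * h / 3.0


def p1(a, n, f, F, alpha, beta):
    """Expression (4): exact probability that the range exceeds a."""
    integral = simpson(
        lambda u: f(u) * (F(u + a) - F(u)) ** (n - 1), alpha, beta - a
    )
    return 1.0 - (1.0 - F(beta - a)) ** n - n * integral


def p2(a, n, f, F, alpha, beta):
    """Expression (5): the same probability if x_1 and x_n were independent."""
    integral = simpson(
        lambda u: f(u) * (1.0 - F(u)) ** (n - 1) * F(u + a) ** n,
        alpha, beta - a,
    )
    return 1.0 - (1.0 - F(beta - a)) ** n - n * integral


def ratio(a, n, f, F, alpha, beta):
    return p1(a, n, f, F, alpha, beta) / p2(a, n, f, F, alpha, beta)


# Rectangular parent on (0, 1).
def rect_f(x):
    return 1.0


def rect_F(x):
    return min(max(x, 0.0), 1.0)


n = 5
for a in (0.5, 0.9, 0.99):
    print(a, p1(a, n, rect_f, rect_F, 0.0, 1.0),
          ratio(a, n, rect_f, rect_F, 0.0, 1.0))
```

For this parent the ratio R(a) falls toward $(n-1)/n = .8$ as a approaches its greatest value 1, in line with the limit discussed in the next section.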


5. The Limit R 


The assumption involved in computing $P_2(a)$, that $x_n$ and $x_1$ are independent, is equivalent to assuming that $x_1$ is contrasted with the greatest order statistic, $y_n$ say, of an independently drawn sample of the same size. When a is extreme, in the sense that the probability of its being exceeded is small, it might be expected that the expressions $P_1(a)$, $P_2(a)$ will have an interesting relationship. We are led to consider the limiting value of the ratio R(a) as a increases to its greatest value or indefinitely, as the case may be, that is, as $P_1(a)$ and $P_2(a)$ both converge to zero.

Consider, at first, the case where $\alpha$, $\beta$ are both finite. The region $\omega$ is now the triangle bounded by the lines $u = \alpha$, $v = \beta$, $v = u + a$. On applying an appropriate mean-value theorem to each of the integrals appearing in (3), we obtain the form

$$R(a) = \frac{n-1}{n} \cdot \frac{\{F(v^*) - F(u^*)\}^{n-2}\iint_\omega f(u)f(v)\,du\,dv}{\{F(v^{**})\}^{n-1}\{1 - F(u^{**})\}^{n-1}\iint_\omega f(u)f(v)\,du\,dv}, \tag{6}$$

where $(u^*, v^*)$, $(u^{**}, v^{**})$ are points in $\omega$. If a is allowed to approach its greatest value $\beta - \alpha$, the starred points both converge to the point $(\alpha, \beta)$, so that each of the expressions in braces in (6) converges to 1. It follows that the limit of R(a) exists and is given by

$$R = \frac{n-1}{n}. \tag{7}$$

This result, though contradicting the intuitive notion referred to above and which suggests the value 1 for R, is nevertheless distribution-free and independent of $\alpha$ and $\beta$. One might, therefore, be tempted to suppose that (7) would extend immediately to distributions for which one or both of $\alpha$, $\beta$ were not finite, on the grounds that truncation at a sufficiently remote point could have no appreciable effect on the distribution. Such a generalization, however, is false, as shown by the results in Table I.

We now sketch the outlines of a proof of the theorem that for one-tailed distributions the limit R exists and satisfies the double inequality

$$\frac{n-1}{n} \le R \le 1. \tag{8}$$

We may, without loss of generality, take the interval of definition as $0 \le x < \infty$. On integrating (1) and (2) with respect to v and expanding the subsequent integrands by the binomial theorem, R(a) can be expressed in the form

$$R(a) = \frac{n\sum_{i=1}^{n-1}\binom{n-1}{i}(-1)^{i+1}\phi_{n-1-i,\,i}}{n\sum_{i=1}^{n}\binom{n}{i}(-1)^{i+1}\phi_{n-1,\,i}}, \tag{9}$$

where

$$\phi_{r,s} = \int_0^\infty G^r(u)G^s(u + a)f(u)\,du \tag{10}$$

and

$$G(u) = 1 - F(u). \tag{11}$$

In the successive terms of the sums in (9), with i increasing, the
powers of G(u+a) that occur increase while those of G(u) do not. By
considering the functions


f "Gr(u)Ge(u + a)f(u)du 


(12) I, 9) = : ’ 
f G*(u)G*(u + a)f(u)du 
0 





f " Gr(u)Ge1(u + a)f(u)du 
(13) J ya = : ’ 


fo rtnarcu + a)f(u)du 





where 6 may have the value 0 or 1, it is readily shown that, uniformly 
in a for OSa<o, 
(14) lim I,,,@.9 = 0 


and that 
(15) J o,4% a< G(a)/ { G(z) } 8 
If 5=1 we may choose z such that G(z) =./G(a), and we thus have 


J Gr(u)G*"(u + a)f(u)du 1+ Ty e41 
(16) ~ = J, 2)- 1 +1 map 
f Gr+8(u)G*(u + a)f(u)du aoa 
0 








1+ | Peon) 
es ae 


where ¢ has the value 1 or $. On letting a (and z) tend to infinity, we 
reach the result 





< {G(a)}"- 


$$R = \frac{n-1}{n}\lim_{a \to \infty}\frac{\phi_{n-2,1}}{\phi_{n-1,1}}. \tag{17}$$

Since $\phi_{n-1,1} \le \phi_{n-2,1}$, we establish the first half of the double inequality, namely

$$R \ge \frac{n-1}{n}. \tag{18}$$


Finally, on integrating $\phi_{n-2,1}$ and $\phi_{n-1,1}$ by parts, we obtain the relations

$$\frac{n-1}{n} \cdot \frac{\phi_{n-2,1}}{\phi_{n-1,1}} = \frac{G(a) - \int_0^\infty G^{n-1}(u)f(u + a)\,du}{G(a) - \int_0^\infty G^n(u)f(u + a)\,du} \le 1, \tag{19}$$

and the second of the desired inequalities, namely,

$$R \le 1,$$

follows at once.

The result (8) is thus established.

Attempts to prove the same result for two-tailed distributions have 
not been successful, although the numerical results obtained suggest 
that it is true in general. 


III. THE SUPPOSITION OF NORMALITY 
6. The Approximation $P_3(a)$


If we suppose that $x_1$ and $x_n$ are not only independent but also normally distributed, their difference $x_n - x_1$, that is, the range, is also normally distributed and, knowing the parameters of this distribution, the probability that the range will exceed any given value a can be obtained by entering the standard normal tables. This approximation to $P_1(a)$ is denoted by $P_3(a)$. The required parameters are found from the mean and variance (when they exist) of the distributions of each of the end order statistics. When the density function f(x) is known, their determination is straightforward. For some of the distributions considered in Table I use has been made of the tabulated results of Hastings, Mosteller, Tukey and Winsor [5] and of Godwin [2]; for others direct calculations were made.

If $\mu_1$, $\mu_n$ are the expected values and $\sigma_1^2$, $\sigma_n^2$ the variances of $x_1$, $x_n$, respectively, the combined assumption of independence and normality leads to the standard normal variate

$$\frac{(x_n - x_1) - (\mu_n - \mu_1)}{\sqrt{\sigma_1^2 + \sigma_n^2}} = z, \text{ say.} \tag{21}$$

The value of $P_3(a)$ is then given by

$$P_3(a) = \Pr\left\{z > \frac{a - (\mu_n - \mu_1)}{\sqrt{\sigma_1^2 + \sigma_n^2}}\right\}. \tag{22}$$

The results obtained by the application of this formula are set out in Table I.
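
Formula (22) needs only the upper tail of the standard normal distribution. A minimal sketch follows, using the complementary error function in place of printed tables; the function names are illustrative. The inputs are the moments for samples of 5 from a normal parent quoted in Section 7 ($\mu_5 = -\mu_1 = 1.16296$, $\sigma_1^2 = \sigma_5^2 = .447535$) with a = 3.86.

```python
import math


def upper_tail(z):
    """Pr(Z > z) for a standard normal variate Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def p3(a, mu_1, mu_n, var_1, var_n):
    """Formula (22): normal approximation to the tail of the range."""
    z = (a - (mu_n - mu_1)) / math.sqrt(var_1 + var_n)
    return upper_tail(z)


value = p3(3.86, -1.16296, 1.16296, 0.447535, 0.447535)
print(round(value, 4))        # about .0525, the figure quoted in Section 7
print(round(0.8 * value, 4))  # R P_3(a) with R = .8
```

With $P_1(a) = .0498$ for this case, the product $.8P_3(a)$ falls short by roughly 16 per cent of $P_1(a)$, which is the size of error reported for normal samples in Section 7.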


7. Application of the Approximation RP;(a) 


When samples are drawn from a given population, it is necessary to 
have some knowledge of the percentage points of the distribution 
of the range in order to decide whether the range obtained in any par- 
ticular case is significantly large. For this purpose we may have avail- 
able data from k independent samples each of size n. If the parent 
population is normal and an unbiased estimate of its variance is found 
from these samples, the required percentage points of the distribution 
can be found by referring to published tables of the “Studentized” 
range [4]. Since it is not our purpose to try to improve on this method, 
any application of the approximations discussed in this paper must be 
to (a) cases involving non-normal distributions, or (b) problems con- 
cerned with contrasts for which no tables are available. 

A survey of Table I shows that the approximation $RP_3(a)$ is likely to prove most useful for non-normal populations in the vicinity of the 5 per cent point. Restricting ourselves to those entries in the table for which $P_1(a)$ is about .05, we note first that the error made in using $RP_3(a)$ in place of $P_1(a)$ is around 16% of $P_1(a)$ in the case of normal samples (15.7% for samples of size 5, 16.5% for samples of 8 and 16.8% for samples of 10), while for the others it ranges from 3.0% (symmetrical triangular) to 65.4% (one-tailed hyperbolic). It is of interest to compare these biases with errors of estimation that may be met in practical situations.

The direct way of using the k samples of n in the non-normal case would be to compute the corresponding k ranges and determine the quantile $w_{.95}$, say, for this sample of k values. If this statistic is to be used for estimating the corresponding population quantile, we shall need to investigate an appropriate confidence interval for that quantile. The magnitude of this interval is indicated by the corresponding confidence interval for the parameter $\pi$ in the binomial distribution $\binom{k}{h}\pi^h(1-\pi)^{k-h}$, where h is the number of "successes" observed in the sample of k, here the number of sample ranges greater than or equal to $w_{.95}$. If k=180 and h=9, the 95 per cent confidence limits for $\pi$ are .095 and .024, representing errors of 90% and 52%, respectively, if $\pi$=.05; if k=100 and h=5, the corresponding limits are .112 and .017, representing errors of 124% and 66%; if k=60 and h=3, the limits are .140 and .010, representing errors of 180% and 80%; and if k=20 and h=1, the limits are .249 and .001, representing errors of 398% and 98%. Hence the errors determined above for the approximation $RP_3(a)$ are exceeded by those obtained by using the direct method, based on the 95 per cent confidence interval, when 100 samples of 5 are available, and are extremely likely to be exceeded when even as many as 180 samples of 5 are available.
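
The text does not say how the binomial confidence limits were obtained. As one possible reading, the sketch below computes exact equal-tailed limits (often called Clopper-Pearson limits) by bisection on the binomial distribution function; this choice of method and all function names are assumptions made for illustration, and for the larger values of k the results may differ slightly in the third decimal from the printed figures.

```python
import math


def binom_cdf(h, k, p):
    """Pr(X <= h) for X binomial with k trials and success probability p."""
    return sum(
        math.comb(k, j) * p ** j * (1.0 - p) ** (k - j) for j in range(h + 1)
    )


def solve(g, target, increasing, iters=200):
    """Bisection on [0, 1] for a monotone function g with g(p) = target."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (g(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def exact_limits(h, k, level=0.95):
    """Equal-tailed exact limits for the binomial parameter."""
    tail = (1.0 - level) / 2.0
    lower = 0.0
    if h > 0:
        lower = solve(lambda p: 1.0 - binom_cdf(h - 1, k, p), tail, True)
    upper = 1.0
    if h < k:
        upper = solve(lambda p: binom_cdf(h, k, p), tail, False)
    return lower, upper


def relative_errors(lower, upper, pi=0.05):
    """Errors of the limits as fractions of the true value pi."""
    return (upper - pi) / pi, (pi - lower) / pi


lower, upper = exact_limits(1, 20)
print(round(lower, 3), round(upper, 3))
print(relative_errors(lower, upper))
```

For k=20 and h=1 this gives limits close to the .001 and .249 quoted in the text.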

It must be recalled, however, that the errors so far associated with $RP_3(a)$ have been computed on the basis of known means and variances for the distributions of $x_1$ and $x_n$. When, as will happen in practice, the density function f(x) is unknown, the relevant parameters must be estimated from the observations themselves, and the use of these estimates in place of the true values will modify the error in approximating to $P_1(a)$.

To investigate this effect, let us assume that the parent population
is symmetrical and that we have available 20 independent samples of
5 observations each, so that k=20, n=5. Denote the estimates of the
means and variances of $x_1$, $x_5$ by $\bar{x}_1$, $\bar{x}_5$ and $s_1^2$, $s_5^2$, respectively. No two
of these estimates are strictly independent, but the dependence is
probably not strong and, as we are interested here chiefly in rough
comparisons, we shall assume independence. On the basis of the
approximate normality of distribution of the order statistics $x_1$, $x_5$,
referred to in Section 1, the standard error of $\bar{x}_5-\bar{x}_1$ is $\sigma/\sqrt{10}$ while that
of both $s_1^2$ and $s_5^2$ is $\sigma^2/\sqrt{9.5}$, $\sigma^2$ being the common variance of $x_1$
and $x_5$. On replacing $\mu_5-\mu_1$ by $\bar{x}_5-\bar{x}_1$ and $\sigma_1^2+\sigma_5^2$ by $s_1^2+s_5^2$, we have,
in place of (22), the approximation, $P_3^*(a)$ say, given by

(23) $P_3^*(a) = \Pr\left\{ z \ge \dfrac{a-(\bar{x}_5-\bar{x}_1)}{\sqrt{s_1^2+s_5^2}} \right\}$,

where z is N(0, 1). The working approximation to $P_1(a)$ is now $RP_3^*(a)$
where, for n=5, R=.8 for all distributions defined over finite intervals
and may presumably be taken as .8 for all two-tailed distributions that
do not depart widely from normality.

In the special case of a normally distributed parent population we
have, quoting from [2],

$\mu_5 = -\mu_1 = 1.16296$, $\sigma_1^2 = \sigma_5^2 = .447535$,

giving, for a=3.86, $P_1(a)$=.0498 and $P_3(a)$=.0525. If $\bar{x}_5-\bar{x}_1$
overestimates $\mu_5-\mu_1$ by one standard error, say, and simultaneously $\sigma_1^2$,
$\sigma_5^2$ are each overestimated by one standard error, we find $P_3^*(a)$=.1122
and $.8P_3^*(a)$=.0898. We have taken one standard error in each case
rather than the customary two, which correspond roughly to the 95
per cent confidence limits, but we have supposed that the errors
accumulate whereas they will often tend to cancel. Even though the
result .0898 overestimates $P_1(a)$ by about 80%, it compares quite
favourably with the upper 95 per cent confidence limit obtained by
the direct method. A similar procedure on the side of underestimation
gives $.8P_3^*(a)$=.0099, as compared with .001, the lower 95 per cent
confidence limit obtained by the direct method.
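
The arithmetic of the normal case can be retraced with a few lines of code. This is a minimal sketch under the assumptions stated in the text (independence and normality of the two extreme order statistics, k=20 samples of n=5); the function and variable names are illustrative and not part of the paper.

```python
from math import erfc, sqrt


def upper_tail(z):
    """Pr{Z >= z} for a standard normal variate Z."""
    return 0.5 * erfc(z / sqrt(2))


def p3_range(a, mean_diff, var_sum):
    """Normal approximation to Pr{x_5 - x_1 >= a}."""
    return upper_tail((a - mean_diff) / sqrt(var_sum))


# Moments of the extreme order statistics for samples of 5 from N(0, 1),
# as quoted in the text.
MU5, VAR = 1.16296, 0.447535
K, R, A = 20, 0.8, 3.86

se_diff = sqrt(VAR) / sqrt(K / 2)      # sigma / sqrt(10)
se_var = VAR / sqrt((K - 1) / 2)       # sigma^2 / sqrt(9.5)

p3 = p3_range(A, 2 * MU5, 2 * VAR)
# Mean difference and both variances one standard error too high.
p3_over = p3_range(A, 2 * MU5 + se_diff, 2 * (VAR + se_var))
# Mean difference and both variances one standard error too low.
p3_under = p3_range(A, 2 * MU5 - se_diff, 2 * (VAR - se_var))

print(round(p3, 4), round(p3_over, 4), round(R * p3_over, 4), round(R * p3_under, 4))
```

The four printed values correspond to the quantities .0525, .1122, .0898 and .0099 discussed above.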

The above conjectures have been tested by means of random drawings
of samples of 5 from (i) the standard normal distribution, using
the tables of random deviates given in [7]; (ii) the rectangular
distribution $f(x)=1$ for $0 \le x \le 1$, using the five-figure tables of random
numbers given in [1] and [9]; and (iii) the beta distribution for the values
2, 2 of the parameters, namely $f(x)=6x(1-x)$ for $0 \le x \le 1$, using
five-figure tables of random numbers in conjunction with the t-tables for
4 degrees of freedom given in [4] and subsequently transforming the
variate values found. The results are collected in Table II,
corresponding to the values 20, 60, 100 and 180 of k and to the value .8 of R.
(In each case, the 180 samples are the compound of all the preceding
ones.) The values of the relevant parameters are also included,
corresponding to the value $\infty$ of k.
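
The text does not say which transformation was applied to the tabled variates. One transformation that would serve, offered here only as an assumption, rests on the fact that if t follows Student's distribution with 4 degrees of freedom then $\tfrac12 + t/(2\sqrt{4+t^2})$ follows the beta distribution with parameters 2, 2. The sketch checks this identity through the two distribution functions; all names are hypothetical.

```python
from math import sqrt


def t4_cdf(t):
    """Closed-form distribution function of Student's t with 4 degrees of freedom."""
    s2 = t * t / (1 + t * t / 4)
    return 0.5 + 0.375 * (t / sqrt(1 + t * t / 4)) * (1 - s2 / 12)


def beta22_cdf(x):
    """Distribution function for the density 6x(1 - x) on [0, 1]."""
    return 3 * x**2 - 2 * x**3


def t4_to_beta22(t):
    """Map a t(4) variate to a beta(2, 2) variate (assumed transformation)."""
    return 0.5 + t / (2 * sqrt(4 + t * t))


# The two distribution functions agree at corresponding points.
for t in (-3.0, -1.0, 0.0, 0.5, 2.0, 10.0):
    assert abs(t4_cdf(t) - beta22_cdf(t4_to_beta22(t))) < 1e-12
```

With this mapping a percentage point of t for 4 degrees of freedom, looked up at a random probability level, becomes a random observation from the beta population.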

It is observed that for the normal and beta distributions the
approximation $RP_3^*(a)$ is extremely good for all values of k. For the
rectangular distribution it is only slightly less satisfactory. In only three
situations is the difference between $\mu_5-\mu_1$ and $\bar{x}_5-\bar{x}_1$ of the same sign
as that between $\sigma_1^2+\sigma_5^2$ and $s_1^2+s_5^2$. The greatest percentage error in
$P_3^*(a)$ relative to $P_3(a)$ on the side of overestimation is 46.7
(rectangular for k=20) and on the side of underestimation 53.8 (beta for
k=20). In no case have errors as large as those conjectured above for
the normal case (114% on the side of overestimation, 77% on that of
underestimation) been approached.

We conclude that the use of the approximation $[(n-1)/n]P_3^*(a)$ is, for
samples of small size, preferable to the direct method when the number
of samples available lies between 20 and 180.
TABLE II
RESULTS OF RANDOM DRAWINGS—CONTRAST $x_5 - x_1$








Standardized Percentage 
overestimation error in 
-8P3*(a) of P;*(a) 
relative to 
oto? P,(a) 








Rectangular , -8416 . 01555 . ° — .86 +46.7 
a=.92 ‘ 8354. -01710 ‘ ‘ —2.38 —14.9 
P,(a)=.0544 ° -8407 . -02060 ‘ — .48 +17.1 
-8Ps(a) = .0814 ; -8390 . -01869 : —2.14 +13.0 
-8333 . -01984 





Beta ‘ -7686 . -01150 ‘ —2.83 —53.8 
a=.78 . -7698 . -01603 , F + 8.1 
P,(a) = .0497 R 7434. 01422 d +1.66 — 5.0 
-8Ps(a) = .0576 ; -7550 01454 ‘ d — 5.0 
7607. -01565 





Normal 1.108. . 7307 J ‘ sd +35.8 
a=3.86 1.23... .3849 p — .06 + 9.3 
P;(a) = .0498 , S| ae .3822 é j d —35.8 
-8Pi(a) = .0420 / 1.144. .4176 4 ‘ —15.6 
1.163 .4475 .4475 























IV. EXTENSIONS TO OTHER CONTRASTS 
8. The Contrast $x_n - \tfrac12(x_1 + x_2)$


The general procedure outlined above can be applied to contrasts
other than the range. As a simple extension we shall consider the
contrast $x_n-\tfrac12(x_1+x_2)$. The expression for $P_1(a)$ is obtained from the joint
probability density function of the three order statistics written, upon
integration over the region for which the inequalities $x_2>x_1$,
$x_n>\tfrac12(x_1+x_2)+a$, $x_n>x_2$ are simultaneously satisfied. The integrand is here

(24) $n(n-1)(n-2)\,f(x_1)f(x_2)\{F(x_n)-F(x_2)\}^{n-3}f(x_n)$.

Assuming that f(x) is defined over the interval $(-\infty, \infty)$ and that the
order of integration is $x_n$, $x_2$, $x_1$ in turn, the integral is most easily
evaluated as the sum of two triple integrals for which the terminals
are, respectively in order,

$(\tfrac12(x_1+x_2)+a,\ \infty)$, $(x_1,\ x_1+2a)$, $(-\infty,\ \infty)$

and $(x_2,\ \infty)$, $(x_1+2a,\ \infty)$, $(-\infty,\ \infty)$. When the density function is
defined over a finite interval $(\alpha, \beta)$, the integral is evaluated as a triple
integral for which the terminals are, in order,

$(\tfrac12(x_1+x_2)+a,\ \beta)$, $(x_1,\ 2\beta-2a-x_1)$, $(\alpha,\ \beta-a)$.


In forming the approximation $P_2(a)$ we assume that $x_1$ and $x_2$ are
correlated but are both independent of $x_n$, leading to the integrand

(25) $n^2(n-1)\,f(x_1)f(x_2)\{1-F(x_2)\}^{n-2}F^{n-1}(x_n)f(x_n)$.

When f(x) is defined over the finite interval $(\alpha, \beta)$ it is readily shown
that the ratio $R(a)=P_1(a)/P_2(a)$ converges to the limit

(26) $R = (n-2)/n$,

as a converges to $\beta-\alpha$. No attempt has been made to investigate bounds
for R when f(x) is defined under more general conditions, although it
is conjectured that the above limit is applicable to the normal case
and that, in general, the upper bound is 1.
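
A heuristic for this limit, given here as a sketch rather than as the argument of the paper, is to compare the exact joint density of the three order statistics with the density implied by the independence assumption. As a approaches the length of the interval, the region of integration shrinks to the corner where $x_1$ and $x_2$ lie at the lower terminal and $x_n$ at the upper, so that $F(x_2)\to 0$ and $F(x_n)\to 1$:

```latex
\frac{n(n-1)(n-2)\,f(x_1)f(x_2)\,\{F(x_n)-F(x_2)\}^{n-3}\,f(x_n)}
     {n^{2}(n-1)\,f(x_1)f(x_2)\,\{1-F(x_2)\}^{n-2}\,F^{\,n-1}(x_n)\,f(x_n)}
 = \frac{(n-2)\,\{F(x_n)-F(x_2)\}^{n-3}}{n\,\{1-F(x_2)\}^{n-2}\,F^{\,n-1}(x_n)}
 \;\longrightarrow\; \frac{n-2}{n}.
```

For n=5 the limit is .6, the multiplier that appears in the working approximation used for samples of 5.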

In investigating the approximation $P_3(a)$, which is based on the
assumption that the variates $\tfrac12(x_1+x_2)$ and $x_n$ are normally and
independently distributed, we require the means and variances of $x_1$, $x_2$, $x_n$
as well as the covariance between $x_1$ and $x_2$. If these parameters are, in
turn, $\mu_1$, $\mu_2$, $\mu_n$ and $\sigma_1^2$, $\sigma_2^2$, $\sigma_n^2$, $\sigma_{12}$, the present assumption leads to the
standard normal variate

(27) $z = \dfrac{[x_n-\tfrac12(x_1+x_2)]-[\mu_n-\tfrac12(\mu_1+\mu_2)]}{\sqrt{\tfrac14(\sigma_1^2+\sigma_2^2+2\sigma_{12})+\sigma_n^2}}$,

so that, for given a>0,

(28) $P_3(a) = \Pr\left\{ z \ge \dfrac{a-[\mu_n-\tfrac12(\mu_1+\mu_2)]}{\sqrt{\tfrac14(\sigma_1^2+\sigma_2^2+2\sigma_{12})+\sigma_n^2}} \right\}$.

The indicated approximations to $P_1(a)$ are thus $[(n-2)/n]P_2(a)$ and
$[(n-2)/n]P_3(a)$.
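
The normal approximation for this contrast is easily computed once the moments are in hand. The sketch below is illustrative only: the function names are hypothetical, and the moments used in the example are the standard ones for order statistics from the rectangular distribution on (0, 1), not values taken from the paper.

```python
from math import erfc, sqrt


def p3_contrast(a, mu1, mu2, mun, var1, var2, varn, cov12):
    """Normal approximation to Pr{x_n - (x_1 + x_2)/2 >= a}."""
    mean = mun - 0.5 * (mu1 + mu2)
    var = 0.25 * (var1 + var2 + 2 * cov12) + varn
    return 0.5 * erfc((a - mean) / sqrt(var) / sqrt(2))


def uniform_moment(n, i, j=None):
    """Mean (j omitted) or covariance of order statistics i <= j from U(0, 1)."""
    if j is None:
        return i / (n + 1)
    return i * (n + 1 - j) / ((n + 1) ** 2 * (n + 2))


n = 5
moments = dict(
    mu1=uniform_moment(n, 1), mu2=uniform_moment(n, 2), mun=uniform_moment(n, n),
    var1=uniform_moment(n, 1, 1), var2=uniform_moment(n, 2, 2),
    varn=uniform_moment(n, n, n), cov12=uniform_moment(n, 1, 2),
)

a = 0.90                                   # hypothetical critical value
p3 = p3_contrast(a, **moments)
working = (n - 2) / n * p3                 # the multiplier (n - 2)/n of the text
```

The function applies unchanged when the seven moments are replaced by estimates obtained from k samples.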

If the density function f(x) is unknown, use may be made of the
last-written formula in approximating to $P_1(a)$, where the parameters
appearing on the right-hand side of the expression (28) for $P_3(a)$ are to be
estimated. As before, we may have available k samples of n for this
purpose. When the appropriate estimates are inserted, the corresponding
probability may be denoted by $P_3^*(a)$, and for distributions that
are defined over a finite interval the working approximation to $P_1(a)$
then becomes $[(n-2)/n]P_3^*(a)$. The same rule may be presumed for
distributions that are normal or nearly normal.

Some of the above considerations have been applied to samples of 5
from populations previously considered, namely the rectangular
distribution, the symmetrical triangular distribution, the beta
distribution with parameters 2, 2, and the normal distribution.

For the first of these we have $f(x)=1$ for $0 \le x \le 1$ and the integrals
defining $P_1(a)$, $P_2(a)$ are readily resolved. To find $P_3(a)$ we used low
moments given in [5].

For the triangular distribution $f(x)=x$ for $0 \le x \le 1$ and $f(x)=2-x$
for $1 \le x \le 2$, the evaluation of the integrals tends to become tedious
and, accordingly, $P_2(a)$ was not computed. For the beta distribution
with parameters 2, 2, $f(x)=6x(1-x)$ for $0 \le x \le 1$, the evaluation of
$P_1(a)$, though exceedingly laborious, was carried through for three
values of a.

For the standard normal distribution, $P_1(a)$ was evaluated by the
method of double quadrature, using Simpson’s rule. Here $P_1(a)$ was
first expressed in the form

(29) $P_1(a) = 1 - 20\int_{-\infty}^{\infty}\int_{0}^{2a} f(x)\,f(x+u)\,\{F(a+x+\tfrac12 u) - F(x+u)\}^{3}\,du\,dx$

and intervals of 0.2 were chosen for both u and x.

The results are collected in Table III. It will be observed that when
$P_1(a)$ lies between .01 and .05, $RP_3(a)$ affords an extremely close
approximation to $P_1(a)$ for the triangular, beta and normal distributions,
while for the rectangular distribution the approximation is excellent
TABLE III
TAIL PROBABILITIES FOR THE DISTRIBUTION OF $x_5-\tfrac12(x_1+x_2)$








Distribution P,(a) P,(a) P;(a) R(a) R RP,(a) 





Rectangular ° -0507 -0704 -0947 -720 . -0568 
f(z) =1 for OS7 51 ° -0162 -0239 -0582 -678 . -0349 
-0103 -0154 -0500 -667 





Symmetrical triangular é -0503 
2 forO0s2z31 é -0105 


.0720 
fz) -(5 —z for 1s232 


-0228 





Beta, parameters 2, 2 3 .0621 .0907 
f(z) =62 (1 —z) .0452 
for OSz51 4 -0184 -0404 





Standard normal 
1 é -1004 .1221 

Ia) =——e-v2* . 4 .0815 
vie ’ .0464 .0519 

for —~ a <z<@ J -0094 -0102 


























* Question marks here have the same significance as in Table I. 
when $P_1(a)$ is of the order of .05 and only slightly astray for smaller
values of $P_1(a)$.
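
The double quadrature used for the normal case can be sketched as follows. The integral is written from first principles: given the two smallest observations x and x+u, the contrast falls short of a only if the remaining n-2 observations all lie between x+u and a+x+u/2, which requires u<2a. The truncation of the outer range at plus and minus 8, and all function names, are assumptions of this sketch.

```python
from math import erf, exp, pi, sqrt


def pdf(x):
    """Standard normal density."""
    return exp(-0.5 * x * x) / sqrt(2 * pi)


def cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1 + erf(x / sqrt(2)))


def simpson_weights(m):
    """Composite Simpson weights 1, 4, 2, ..., 4, 1 for m (even) intervals."""
    return [1 if i in (0, m) else 4 if i % 2 else 2 for i in range(m + 1)]


def p1_normal(a, h=0.2, x_limit=8.0, n=5):
    """Pr{x_n - (x_1 + x_2)/2 >= a} for samples of n from N(0, 1)."""
    mu, mx = round(2 * a / h), round(2 * x_limit / h)
    if mu % 2 or mx % 2:
        raise ValueError("Simpson's rule needs an even number of intervals")
    wu, wx = simpson_weights(mu), simpson_weights(mx)
    total = 0.0
    for i, wi in enumerate(wx):
        x = -x_limit + i * h
        inner = 0.0
        for j, wj in enumerate(wu):
            u = j * h
            gap = cdf(a + x + 0.5 * u) - cdf(x + u)
            inner += wj * pdf(x + u) * gap ** (n - 2)
        total += wi * pdf(x) * inner
    return 1 - n * (n - 1) * total * (h / 3) ** 2


for a in (3.0, 3.4, 4.0):
    print(a, round(p1_normal(a), 4))
```

The three values of a are those for which the text reports $P_1(a)$ in the normal case; the interval of 0.2 follows the text.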

The approximations $P_3(a)$ are based on known values of the
parameters in the distributions of the indicated order statistics. As
mentioned above, these parameters will, in practice, be estimated from the
data provided by groups of samples. To investigate the effect of this
procedure on the working approximation to $P_1(a)$, samples of 5 were
drawn from the beta and normal distributions, in manners previously
described, and the values of $x_1$, $x_2$, $x_5$ were listed. For the beta
distribution, three sets of 20 samples were drawn randomly from an original
lot of 100 randomly selected samples; for the normal distribution one
set of 20 samples and four independent sets of 100 samples were drawn.

The results of these drawings are collected in Table IV for the values
of k mentioned, together with the values of $P_1(a)$ and the working
approximation $.6P_3^*(a)$ for selected values of a. For convenience in making
comparisons, the table includes the values of the parameters
themselves and of the approximation $.6P_3(a)$, corresponding to the value
$\infty$ of k.


TABLE IV
RESULTS OF RANDOM DRAWINGS—CONTRAST $x_5-\tfrac12(x_1+x_2)$








Distribution 2 2: 2 6,2 822 82 a? -6P,*(a) 





Beta a=.68 a=.70 a=.75 





P,(.68) = .0621 . ° . ’ J ° A -0274 .0219 .0120 
P,(.70) = .0452 ° . . < A : A -0173 .0127 .0056 
P,(.75) =.0184 . ° ° ‘ A p A -0494 .0395 .0213 
0201 





0242 





Normal 





P,(3)=.1004 
Pi(3.4)=.0464 
P,(4) = .0094 

















In the first set of 20 drawings from the beta distribution the error
in $P_3^*(a)$ relative to $P_3(a)$ is, for each value of a, of the order of 50%;
for the second set of 20 it is of the order of 73%; for the third set it is
of the order of 10%; while the average of the three sets is of the order
of 44%. For the original set of 100 drawings the error is of the order of
17%.

For the normal samples, the set of 20 drawings leads to errors of 11% 
when a=3, 19% when a=3.4 and 33% when a=4. The means of the 
errors for the four sets of 100 drawings are 23% when a=3, 35% when 
a=3.4 and 62% when a=4. 

One set of 20 from the beta distribution yields low percentage errors 
while the other two have high values. The errors exhibited by the single 
set of 20 samples from the normal distribution appear to be exception- 
ally low. The first three sets of 100 drawn each display higher errors, 
the second being, perhaps, exceptionally high. The fourth set has errors 
almost identical with those of the independent set of 20. 

From these considerations it appears that about 100 samples of 5
are required in order that the approximation $.6P_3^*(a)$
should provide a reliable test criterion.

As in Section 6, the direct method of procedure would lead to errors
of 124% and 66%, corresponding to the 95 per cent confidence limits
for $\pi$ when $\pi$=.05, k=100, h=5. Upon examination of Table IV we
find that for the beta distribution, with k=100 and a=.7, the
percentage error in $.6P_3^*(a)$ relative to $P_1(a)$ is 20% on the side of
underestimation. For the normal distribution, with k=100 and a=3.4, the errors
vary from 12% on the side of overestimation to 45% on the side of
underestimation.

We conclude that the use of the approximation $[(n-2)/n]P_3^*(a)$ is
preferable to the direct method in the circumstances described.


9. Concluding Remarks 


For any contrast it is possible to write down immediately the
expressions for $P_1(a)$ and $P_2(a)$, the latter being based on convenient
assumptions of independence between the groups of order statistics that
enter into the contrast. When, as in all practical situations, f(x) is
defined over a finite interval $(\alpha, \beta)$, the calculation of R is
straightforward.

Thus, considering the contrast $\tfrac12(x_{n-1}+x_n)-\tfrac12(x_1+x_2)$ and assuming
that the end pairs of order statistics are independent of each other in
forming the approximation $P_2(a)$, we find, for any distribution defined
over a finite interval,

$R = \dfrac{(n-2)(n-3)}{n(n-1)}$,

while, for the contrast $x_n-\tfrac13(x_1+x_2+x_3)$ we find, under like conditions,

$R = \dfrac{n-3}{n}$.

With the aid of such coefficients working approximations to the true
probability $P_1(a)$, involving the principle of normality, can be
developed generally by direct extensions of the methods described above.
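
The coefficients quoted for the four contrasts follow a common pattern, which may be sketched as below. The general formula is an inference from those four cases and from the ratio of the leading constants in the exact and assumed joint densities; it is not stated in the paper, and the function name is hypothetical. Here r and s are the numbers of lowest and highest order statistics entering the contrast.

```python
from fractions import Fraction
from math import factorial


def limiting_ratio(n, r, s):
    """Conjectured limit of P_1(a)/P_2(a) when the r lowest and s highest
    order statistics in a sample of n are treated as independent groups."""
    if r < 1 or s < 1 or r + s > n:
        raise ValueError("need r, s >= 1 and r + s <= n")
    exact = Fraction(factorial(n), factorial(n - r - s))
    assumed = Fraction(factorial(n), factorial(n - r)) * Fraction(factorial(n), factorial(n - s))
    return exact / assumed


n = 5
print(limiting_ratio(n, 1, 1))   # range: (n - 1)/n
print(limiting_ratio(n, 2, 1))   # x_n - (x_1 + x_2)/2: (n - 2)/n
print(limiting_ratio(n, 2, 2))   # end pairs: (n - 2)(n - 3)/[n(n - 1)]
print(limiting_ratio(n, 3, 1))   # x_n - (x_1 + x_2 + x_3)/3: (n - 3)/n
```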
OPTIMUM GROUPING IN ONE-CRITERION VARIANCE
COMPONENTS ANALYSIS

National Bureau of Standards


Tests of significance associated with the single criterion analysis of
variance usually assume that a sample of n observations is drawn
from each of m normal populations with common variance $\sigma^2$. In the
“components of variance” model, the m population means are
themselves considered a sample of m observations on a superpopulation,
also normal, with variance $\theta\sigma^2$. The null hypothesis $\theta=0$ is tested
against the alternative $\theta>0$ by means of the F ratio. Detailed
descriptions of this model are given by Eisenhart ([3] and [4]); Ferris, Grubbs,
and Weaver [5]; and Crump [2].

When the number of populations is indefinite and the total number of
observations N = mn is limited, it is possible to determine which (m, n)
combination gives the most powerful F-test for a given N. This
problem was considered in [1], [4], and [5]. For all cases examined, each
(m, n) combination in turn was found to provide the most powerful test
for some interval of $\theta$.
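
The power of a given grouping can be written down explicitly. The expression below is a standard consequence of the components-of-variance model rather than a formula quoted from the note; $\theta$ denotes the ratio of the superpopulation variance to the common within-population variance, and $F_{.05}$ the upper 5 per cent point of the central F distribution. Since the between-groups mean square has expectation $\sigma^2(1+n\theta)$, the variance ratio is distributed as $(1+n\theta)$ times a central F:

```latex
\mathrm{Power}(\theta;\, m, n)
  = \Pr\left\{ F_{m-1,\; m(n-1)} > \frac{F_{.05;\; m-1,\; m(n-1)}}{1+n\theta} \right\},
  \qquad N = mn .
```

For fixed N, comparing this quantity across the admissible factorizations N = mn shows which grouping is most powerful at a given $\theta$.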

An extensive list of these optimum groupings would seem, however,
of limited practical interest; for the applied statistician is seldom in a
position to specify in advance the magnitude of $\theta$ which he desires to
detect. To remove the need of a priori information, this note proposes
a simple rule for selecting m and n which gives nearly maximum power
for all $\theta>0$.¹ The rule is to select m and n as nearly equal as possible.
The operating characteristics obtained from this selection are shown 
in Figure 1 for m=n=8, 10, 12, and 16 when the test is conducted at 
the 5 per cent level of significance. The values used to plot the curves 
were obtained for the most part by the well-known method outlined 
in [5] using the percentiles of the F-distribution as given in [6]. Table 
8.4 in [4] also served as a valuable check on the results. The dashed 
curves enclose the operating characteristics for all the (m, n) combina- 
tions that can be formed from the given amount of data. These curves 
approximate the upper and lower envelopes for the family; no single 
(m, n) choice will yield either curve as its operating characteristic. 

Groupings in which m is much less than n are more powerful than 





* Now with Eli Lilly and Company.
1 It should be emphasized that this paper deals wholly with significance tests rather than estimation.
If one is interested in estimating $\theta$ or $\theta\sigma^2$, the suggested procedure may not be at all optimum.
those for which m is approximately equal to n for very small values of
$\theta$, but in this range no grouping gives a test of sufficient power for
practical use. Choosing m much greater than n gives a slightly more
powerful test for very large values of $\theta$, but in this range the choice m=n gives
power very close to unity.
Fig. 1. Operating characteristics of the F-test for testing
$\theta=0$ against $\theta>0$ at the 5% level of significance.
All communications concerning this section should be addressed to the
Abstracts Editor, Professor George E. Nicholson, Jr., Chairman of the
Department of Statistics, University of North Carolina, Chapel Hill, North
Carolina.


Anderson, T. W., “On estimation of param- 
eters in latent structure analysis,” Psycho- 
metrika, 19 (1954), 1-10. 

A new computational procedure for esti- 
mating the parameters in latent structure 
analysis is developed. The procedure has 
the advantage of avoiding the use of im- 
plicitly defined and unobservable quantities 
as well as being relatively simple computa- 
tionally. On the other hand the proposed 
procedure has the disadvantage of using 
only part of the available information and 
of using that part asymmetrically. A nu- 
merical example is worked out in detail. 

After reviewing the basic model of latent 
class structure, the author develops the 
rationale for the estimation method sug- 
gested as the basis for computation. The 
method deduces the proportion of the 
population in each latent class, and the 
probabilities of positive responses for each 
individual in the latent classes, from knowl- 
edge of probabilities of positive responses 
from individuals in the population as a 
whole. There are several possible choices of 
initial data for estimating the same param- 
eters—one may obtain differing estimates 
for the same parameters depending upon 
the particular initial choice made. If the 
latent classes are well defined, however, the 
range of the differences between equivalent 
estimates will be relatively small. B. J.
Winer, University of North Carolina.


Bartlett, M. S., and Rajalakshman, D. V., 
“Goodness of fit tests for simultaneous 
autoregressive series,” Journal of the Royal 
Statistical Society, (B) 15 (1953), 107-24. 

“The simplified method of derivation of
Quenouille’s (1947) goodness of fit test for
autoregressive series with discrete time,
..., is extended to the case of simultaneous
autoregressive series .... The general
solution is applied to a particular first-order
process in two variables.” G. L. Epaert,
Virginia Polytechnic Institute.

Beall, Geoffrey, and Rescia, Richard R., “A 
generalization of Neyman’s contagious dis- 
tributions,” Biometrics, 9 (1953), 354-86. 
A class of discrete distributions depending 
on three parameters, one of which is called 


n, is constructed. It is shown that Neyman’s 
contagious distributions of types A, B, and 
C are members of this class for n=0, 1, 2 
respectively. The calculation of the
probabilities in these distributions requires the
use of recursive relations which are
presented here. The estimation of the
parameter n is a problem which the authors treat
by fitting the frequency of cases with zero
occurrences; the remaining two parameters
are estimated by the method of moments.
Several examples are discussed where it is
shown that values of n other than 0, 1 or 2
give better fit than is obtained with the
distributions previously employed. Lincoln
Moses, Stanford University.


Bechhofer, Robert E., “A single-sample 
multiple decision procedure for ranking 
means of normal populations with known 
variances,” Annals of Mathematical Sta- 
tistics, 25 (1954), 16-39. 

The problem of classifying normal populations
in two or more groups with respect
to the ranking of their true means is
considered. A lower bound for the probability
of correct grouping is obtained by
considering the least favourable configuration
of the means consistent with the given
rankings. In the case of two groups, the
least favourable configuration of the means
is $\mu_{[1]}=\cdots=\mu_{[k-t]}$ for the lower group and
$\mu_{[k-t+1]}=\cdots=\mu_{[k]}$ for the upper group, and
tables of probabilities of correct grouping
have been constructed for values of k
and t and

$\Delta = (\mu_{[k-t+1]}-\mu_{[k-t]})/\sigma$,

where $\sigma$ is the common standard deviation
of the populations. These tables could
conversely be used to determine the sample
sizes when one wants to ensure a certain
lower bound of the probability of correct
grouping for $\Delta>\Delta_0$. The method has been
generalized to two-way classifications when
the mean of the variable $X_{ij}$ is given by
$\mu+\alpha_i+\beta_j$, and the experimenter wants to
pick the population with highest $\alpha_i$ and
highest $\beta_j$. Illustrations of the use of the
tables have been given. M. N. Ghosh,
University of North Carolina.







STATISTICAL ABSTRACTS 


Birnbaum, Allan, “Admissible tests for the 
mean of a rectangular distribution,” Annals 
of Mathematical Statistics, 25 (1954), 157-61. 

The problem of testing for the mean $\theta$ of
a rectangular distribution has been
considered when the range is known and when
the loss function is simple, i.e. assumes the
values zero and one for correct and
incorrect decisions respectively. Using the fact
that the minimum observation u and the
maximum observation v are joint sufficient
statistics, the Bayes solutions are obtained
in terms of functions of u and v. Using the
Neyman-Pearson Lemma, the most powerful
one-sided and two-sided tests are
obtained from the class of Bayes solutions.
M. N. Ghosh, University of North Carolina.


Box, G. E. P., and May, W. A., “A statisti- 
cal design for efficient removal of trends oc- 
curring in a comparative experiment with 
an application in biological assay,” Bio- 
metrics, 9 (1953), 304-19. 

It is desired to compare the dose response 
curves of two biological preparations. A
trend in time is expected to exist, which
would ordinarily be confounded with the 
dosage response relations. An example in- 
volving four dose levels, each repeated for 
both drugs is given in full detail. The dose 
levels to be given on the eight occasions 
(equally spaced in time) are so chosen that 
linear and quadratic dose effects, linear, 
quadratic and cubic time trend effects, 
and interactions between drugs and each 
of the five named effects can all be esti- 
mated and all estimates be mutually 
orthogonal. Since the dose response curves 
turn out as parallel straight lines in the 
example, and the dose metameter is log 
dose, relative potency is estimated and a 
confidence interval is given. The last section 
of the paper deals with the method used to 
choose dose levels yielding the desired 
properties of the design. Lincoln Moses,
Stanford University. 


Chakravarti, N., and Bandyopadhyay, K. S., 
“A note on the consumption of cereal per 
adult unit in Calcutta,” Sankhya, 13 (1953), 
215-18.


Survey information from a family budget 
inquiry conducted during 1950-51 by the 
State Statistical Bureau, Government of 
West Bengal, utilizing only the data for 
Calcutta, forms the basis of this note. The 
method of least squares is used to estimate 
the consumption of cereal in a given age or 
sex group of a random sample for the town 
of Calcutta, India. T. S. Russell, Virginia
Polytechnic Institute. 
Chapman, Douglas G., “The estimation of
biological populations,” Annals of
Mathematical Statistics, 25 (1954), 1-15.

This paper gives a systematic review of
the various methods of sampling used for
estimation of wild populations. The
mathematical models and assumptions are
explicitly stated, which serves the very useful
purpose of forewarning the consumers of
these statistical methods about the pitfalls
in this area. Most of these sampling
methods depend on the tag-recapture technique,
which could possibly be useful for
estimating fish populations. Other interesting
methods depending on size of successive
samples developed by DeLury and others
are also discussed. M. N. Ghosh, University
of North Carolina.


Dwyer, P. S., “Solution of the personnel 
classification problem with the method of 
optimal regions,” Psychometrika, 19 (1954), 
11-26. 

The mathematical problem involved in 
personnel classification is shown to be a 
special case of the general mathematical 
problem of linear programming. After 
presenting numerical examples of variations 
in the general problem that arise in the area 
of personnel classification, the author points 
out that equivalent problems are encoun- 
tered in the Hitchcock transportation prob- 
lem, problems that arise in biometric clas- 
sification, and problems encountered in a 
zero-sum two-person game. Essentially the 
classification problem is that of finding co- 
efficients that maximize a linear form, sub- 
ject to a set of linear constraints. 

For most practical purposes special com- 
putational procedures rather than the more 
general mathematical solution seem most 
feasible. The conditions underlying the 
method of optimal regions as developed by 
the author are generalizations of those given 
earlier by Brogden. Computationally the 
method is an iterative one which basically 
moves hyperplanes parallel to successive 
positions in such a way that the optimal 
solutions, involving the number of points 
within the resulting regions, eventually 
satisfy the desired quotas. In most practical 
problems encountered so far, the method 
leads to a solution in relatively few itera- 
tions. It is recommended for use for prob- 
lems in which there are preassigned quotas 
and only a small number of categories. In 
application to a problem in which there 
were 1152 men and 7 job categories, a solu- 
tion was attained after eight iterations. 

This article provides an excellent and
readable summary of work that has been
done in this area. B. J. Winer, University
of North Carolina.


Grubbs, Frank E., and Coon, Helen J., “On 
setting test limits relative to specification 
limits,” Industrial Quality Control, 10 
(1954), 15. 

Specification limits for a product should
not be used as limits for determining the
acceptability of the product if errors of
measurement will be made in the testing.
The paper shows how to determine test
limits which satisfy several different
criteria. If we let A be the chance and $C_A$
the cost of accepting a nonconforming
piece, and B be the chance and $C_B$ the cost
of rejecting a conforming piece, then the
total expected cost of making wrong
decisions is $C_A A + C_B B$.

A general expression for the test limits
which minimize this expected cost is
presented and tables of factors which
determine the limits are given for $C_A=C_B$,
$C_A=2C_B$, and for A=B.

A major conclusion of the paper is that in
most cases the test limits should be outside
the specification limits unless $C_A \ge 6C_B$. In
the analysis it is assumed that the product
quality and the measurement errors are
normally distributed with known variances.
Albert H. Bowker, Stanford University.


Gumbel, E. J., “The maxima of the mean 
largest value and of the range,” Annals of 
Mathematical Statistics, 25 (1954), 76-84. 

In this article the author generalizes 
some earlier results obtained by both 
Plackett and Moriguti. He shows that the 
maximum of the mean range calculated by 
Plackett holds for any continuous variate 
possessing the first two moments. 

The mean and the standard deviation of 
the largest value and the mean range are 
given for a distribution where the mean 
largest value is a maximum, and for another 
distribution where the mean range is a 
maximum. 

The asymptotic properties of the reduced 
values for the two distributions are com- 
pared. 

Graphs for probability and density func- 
tions which maximize the mean largest 
value (n=2,3,4,5) and for the mean 
largest reduced values and mean ranges as 
functions of n are drawn. A. E. SARHAN, 
University of North Carolina. 


Guttman, L., “Image theory for the struc- 
ture of quantitative variates,” Psycho- 
metrika, 18 (1953), 277-96. 


A new structural model which purports
to avoid problems of indeterminacy
inherent in the factor analysis model is
presented. Within a universe of variates, each
variate is partitioned into a part that is
predictable by linear multiple regression
from the other variates (a common part)
and a part that is not predictable (an alien
part). Sampling of variates within this
universe defines partial images for each variate
(i.e., predicted values from n-1 remaining
variates); the non-predictable parts are
called partial anti-images. The author
contends that this partition is unique, whereas
the factor analysis model provides no
unique definition for the partitioning of the
variates. The factor analysis model is said
to be determinate only if it reduces to the
image analysis model as the size of the
sampling from the universe increases.

Under the partition made by image
analysis, the correlation between two
variates can be expressed as the difference
between the covariance of the partial images
and the covariance of the partial
anti-images. Using this identity computational
procedures for structural analysis of
interrelationships are developed. It is pointed
out that in the special case of determinate
common factors, non-diagonal covariances
of partial anti-images must approach zero.
B. J. Winer, University of North Carolina.


Harman, H. H., “The square root method 
and multiple group methods of factor 
analysis,” Psychometrika, 19 (1954), 39-55. 

Multiple group methods of factoring cor- 
relation matrices have been used for some- 
what varying purposes by various authors. 
By some it is considered as a convenient 
method for extracting factors, by others 
it is considered as a means for locating 
reference vectors in accordance with a pre- 
determined hypothesis. The author at- 
tempts to integrate the various approaches 
to multiple group factoring methods and 
develops a systematic notation for these 
methods. 

Phases of the computational procedures 
are shown to be simplified by application of 
the square root method for solving sets of 
simultaneous linear equations. Basically 
the square root method is a computational 
technique for factoring a matrix into two 
triangular matrices. Detailed steps in the 
numerical application of the square root 
method are given. Its application to mul- 
tiple group factor analysis, computation of 
inverses, and regression analysis in general 
are clearly and compactly presented. For 
many purposes the square root algorithm is
shown to be superior to the Doolittle
algorithm. B. J. Winer, University of North
Carolina.


Hartley, H. O., and David, H. A., “Uni- 
versal Bounds for Mean Range and Ex- 
treme Observation,” Annals of Mathemati- 
cal Statistics, 25 (1954), 85-99. 


The range, mean range, and other recent
techniques of short-cut analysis of variance
are used in industrial quality control under
the assumption that the basic distribution
is normal. For example, an observed range
may be converted to an unbiased estimate
of the standard deviation by multiplication
with a certain constant. The authors
consider to what extent this estimate is biased
when the basic distribution is not normal.
The problem of establishing upper bounds
for $E(\text{range})/\sigma$ has been considered by both
Plackett and Moriguti. They tabulated
upper bounds but little was done with the
lower bound. The authors show that
Moriguti’s solution, which is confined to
finding an upper bound for symmetrical
distributions, applies in general. Also, the
authors derive universal upper and lower
bounds for the ratio $E(\text{range})/\sigma$ for any
f(x) in which $a \le x \le b$ (a and b are
constants). Universal upper bounds are given
for $E(X_n)/\sigma$ for the case where X is finite
and $X_n$ is the largest sample element. A. E.
Sarhan, University of North Carolina.


Jowett, G. H., and Scott, J. F., “Simple 
graphical techniques for calculating serial 
and spatial correlations and mean semi- 
squared differences,” Journal of the Royal 
Statistical Society, Series B, 15 (1953), 81- 
86. 


Two techniques for getting the approxi-
mate value of the serial correlation are
described. The value of the Mean Semi-
squared Difference,
MSSD $= \tfrac{1}{2}\,\text{Average}\,(x_i - x_{i+s})^2$,
is determined by a “tracing”
method and a “transparent scale” method.
The approximate value of the serial correla-
tion is then
$[\text{Average}\,(x_i - \bar{x})^2 - \text{MSSD}]/\text{Average}\,(x_i - \bar{x})^2$.
The graphical methods com-
pare favorably with direct use of a desk
calculator. While they are less accurate,
they are less tedious, can be carried out
with simple equipment, and the tracing
method is actually much quicker when
statistics are required for a large number of
lags. Paul N. Somerville, Virginia Poly-
technic Institute.
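
The arithmetic behind the two graphical techniques can be sketched numerically. The Python fragment below is only an illustration of the two formulas quoted in this abstract, with the averages taken over the available terms; the function names and the sample series are assumptions, and the paper itself works graphically rather than by computation.

```python
def mean_semi_squared_difference(x, s=1):
    """MSSD at lag s: half the average of (x_i - x_{i+s})^2."""
    squared = [(x[i] - x[i + s]) ** 2 for i in range(len(x) - s)]
    return 0.5 * sum(squared) / len(squared)


def approximate_serial_correlation(x, s=1):
    """[Average (x_i - mean)^2 - MSSD] / Average (x_i - mean)^2."""
    mean = sum(x) / len(x)
    average_square = sum((v - mean) ** 2 for v in x) / len(x)
    return (average_square - mean_semi_squared_difference(x, s)) / average_square


# Hypothetical series with a steady trend: the lag-1 value is 0.75.
trend_value = approximate_serial_correlation([1.0, 2.0, 3.0, 4.0, 5.0])
# Hypothetical alternating series: the lag-1 value is -1.0.
alternating_value = approximate_serial_correlation([1.0, -1.0, 1.0, -1.0])
```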


Kamat, A. R., “Some properties of esti- 
mates for the standard deviation based on 
deviations from the mean and variate differ-
ences,” Journal of the Royal Statistical
Society, Series B, 15 (1953), 233-40.

Estimates of the relative variances of $D_{p0}$
and $D_{pr}$ are given, where $D_{p0}$ and $D_{pr}$ are
defined as the unbiased estimates of $\sigma^p$
based respectively on the $p$th power of the
deviation from the mean, and the $p$th power
of the absolute values of $\Delta^r x_i$, the $r$th order
variate differences. The relative variance
is defined as the square of the coefficient of
variation (Fisher). Putting $S_{pr} = (D_{pr})^{1/p}$, it
is shown that for large samples the relative
variance of $S_{2r}$ is smaller than that of $S_{pr}$ if
$p \ne 2$ and that the relative variance of $S_{3r}$ is
less than that of $S_{1r}$. Some results are given
for small samples. If the values in the
sample have a slight linear trend in their
means, it is shown that $D_{11}$ and $D_{21}$ have a
much smaller bias than $D_{10}$ or $D_{20}$. Further,
the increase in their variance is much
smaller. If $w_{pr} = D_{pr}/s^p$ where $s^2 = D_{20}$, then
it is proved that the $k$th moment about zero
of $w_{pr}$ is equal to the ratio of the $k$th mo-
ments about zero of the numerator and
denominator of $w_{pr}$. Paul N. Somerville,
Virginia Polytechnic Institute.


Kempthorne, O., and Tischer, R. G., “An 
example of the use of fractional replica- 
tion,” Biometrics, 9 (1953), 295-303. 

It was desired to study the effect of 
various factors on the acceptability of de- 
hydrated corn. The relevant factors selected 
for study in an exploratory experiment 
were: varieties (8), date of harvesting (4), 
blanching condition (2), temperature of de- 
hydration (2), temperature of storage (2), 
length of storage (2). In addition it was 
considered desirable to use 4 blocks rather 
than a completely randomized design. A 
complete experiment would thus require 
2048 = $2^{11}$ plots. Both blocks and varieties
could be represented as pseudo-factors in a
$2^{11}$ factorial design, so a fractional replicate was
chosen which resulted in no 2-factor inter- 
actions being mutually confounded. In 
addition the design had some features of a 
split-split-block design. 

The rationale for selection of the design 
is fully presented and the analysis and re-
sults are sketched. Lincoln Moses, Stan-
ford University.
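
The plot count quoted in this abstract can be checked with a few lines of Python. The sketch below simply multiplies the numbers of levels listed above, blocks included, and counts the two-level pseudo-factors needed to represent each factor; the dictionary keys and the helper name are illustrative assumptions.

```python
from math import prod

# Numbers of levels as listed in the abstract, with the 4 blocks included.
levels = {
    "varieties": 8,
    "date of harvesting": 4,
    "blanching condition": 2,
    "temperature of dehydration": 2,
    "temperature of storage": 2,
    "length of storage": 2,
    "blocks": 4,
}

total_plots = prod(levels.values())


def two_level_pseudo_factors(n_levels):
    """Number of two-level pseudo-factors for a power-of-two number of levels."""
    return n_levels.bit_length() - 1


pseudo_factor_count = sum(two_level_pseudo_factors(n) for n in levels.values())

# A complete experiment needs 2 ** pseudo_factor_count plots.
plots_match = total_plots == 2 ** pseudo_factor_count
```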


Lancaster, H. O., “A reconciliation of $\chi^2$,
considered from metrical and enumerative
aspects,” Sankhya, 13 (1953), 1-10.

The relationship between the $\chi^2$ of the
goodness of fit test and the deviations of the
moments from their expected values is con-
sidered by use of orthogonal transforma-
tions. These orthogonal transformations
yield alternative proofs of the distribution
of $\chi^2$ used in the goodness of fit without re-
quiring the use of Stirling’s approximation.
The method of this paper is a generalization
of the identification of the $\chi^2$ used in ap-
proximating to the probability of obtaining
exactly m successes in n trials with con-
stant probability with the square of a
standardized normal deviate derived from
the consideration of n samples drawn from
a population where the variable can take
two values, 0 and 1. T. S. Russell, Virginia
Polytechnic Institute.


Lindley, D. V., “Statistical inference,” 
Journal of the Royal Statistical Society, 
Series B, 15 (1953), 30-76. 

The paper is concerned with analysis of 
experiments and the point of view taken is 
that the purpose of experimentation is to 
enable one to decide between certain 
courses of action. Kolmogorov’s axiomatic 
theory of probability is used and Wald’s 
formulation of the statistical decision prob- 
lem is adopted, but not in full generality. 
Two simplifying assumptions are made, 
namely, 1) decisions on experimentation 
have been made and the only decisions re- 
maining to be made are terminal decisions; 
2) both the class of probability distributions 
and of decisions are finite. Under these 
simplifying assumptions it is established 
that there exists a class of decision func- 
tions which is, in some sense, optimum. The 
concept of minimum unlikelihood is in- 
troduced and with it is constructed the 
optimum class of decision functions. Conse- 
quences of previous results are discussed as 
well as what must be specified in order that 
meaning be given to the phrase “the best 
decision function.” Applications of the 
minimum unlikelihood method to some 
common statistical problems are given. 
Of particular interest is the result that $\bar{x}$ is
the minimum unlikelihood estimator of the
mean of a normal population for a wide
class of weight functions. A discussion of
the paper is included. F. S. McFeely,
Virginia Polytechnic Institute.


Loevinger, J., Gleser, G. C., and DuBois, 
P. H., “Maximizing the discriminating 
power of a multiple-score test,” Psycho- 
metrika, 18 (1953), 309-17. 

Starting with a heterogeneous pool of 
items one frequently encounters the prob- 
lem of constructing subtests in a manner 
that maximizes the correlation of items 
within subtests and minimizes the correla- 
tion between subtests. A method which 
seeks to maximize the ratio of inter-item
covariance to total variance within each
subtest is developed for this purpose. This 
ratio is defined to be the saturation of the 
subtest. 

The nucleus for subtest 1 is formed from 
three items having high intercorrelations. 
In Cycle 1 those items in the pool which, 
when added to the nucleus, lower the 
saturation of the subtest are eliminated 
from further consideration in Cycle 1. Of 
the items that do not lower the saturation, 
that one is selected that maximizes the 
saturation. This process is continued at each 
stage with the augmented nucleus until the 
pool of items is exhausted. The construc- 
tion of subtest 2 starts with a new nucleus 
of three items; Cycle 2 follows the same 
pattern as Cycle 1. Computationally, several
cycles can be carried out simultaneously. 

Maximizing the saturation of a subtest 
drawn from a finite pool of items will not 
necessarily result in a battery having maxi- 
mum discrimination power (minimum be- 
tween subtest correlation). The conditions 
under which maximum discrimination is 
not achieved by the method are given. 
These conditions are not in general over- 
restrictive for practical use of the method. 
B. J. Winer, University of North Carolina. 
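
The cycle described in this abstract can be sketched as a simple greedy selection. In the Python fragment below the saturation of a subtest is taken, as an assumption, to be the sum of the off-diagonal covariances of its items divided by the sum of all entries of its covariance matrix; the paper's exact definition may differ. The function names and the covariance matrix are hypothetical.

```python
def saturation(cov, items):
    """Assumed form: inter-item covariance divided by total variance of the subtest."""
    total = sum(cov[i][j] for i in items for j in items)
    inter = sum(cov[i][j] for i in items for j in items if i != j)
    return inter / total


def build_subtest(cov, nucleus, pool):
    """One cycle: grow the nucleus until the pool of items is exhausted."""
    subtest = list(nucleus)
    pool = list(pool)
    while pool:
        current = saturation(cov, subtest)
        scored = [(saturation(cov, subtest + [k]), k) for k in pool]
        # Items that would lower the saturation are dropped for this cycle.
        kept = [(value, k) for value, k in scored if value >= current]
        if not kept:
            break
        _, best_item = max(kept)
        subtest.append(best_item)
        pool = [k for _, k in kept if k != best_item]
    return subtest


# Hypothetical pool: items 0-2 form the nucleus, item 3 fits, item 4 does not.
example_cov = [
    [1.0, 0.8, 0.8, 0.7, 0.0],
    [0.8, 1.0, 0.8, 0.7, 0.0],
    [0.8, 0.8, 1.0, 0.7, 0.0],
    [0.7, 0.7, 0.7, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]
example_subtest = build_subtest(example_cov, [0, 1, 2], [3, 4])
```

A second subtest would be built in the same way from a new nucleus, using the items that remain available.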


Moran, P. A. P., “The random division of 
an interval—Part III,” Journal of the 
Royal Statistical Society, Series B, 15 (1953), 
77-80. 

This paper is a sequel to two previous 
papers on the subject by the same author. 
The distribution of the sum of the squares 
of the intervals into which the line is 
divided is further considered. The lower 
5 per cent and 1 per cent points of the 
distribution can be found exactly up to 
n=8 and n=9 respectively (nine and ten 
intervals). Beyond about n=20 an ap- 
proximation based on the variance ratio 
distribution is probably adequate. For 
values of n between 9 and 20 no workable 
formula has been reached. Franklin S.
McFeely, Virginia Polytechnic Institute.


Nath, Pran, “O. C. curve simplified,” 
Sankhya, 13 (1953), 35-38. 

It was first pointed out by Barnard that 
most O. C. Curves for fraction defectives 
were sufficiently well represented by a 
straight line, if drawn on logarithmic 
probability paper using the logarithmic 
scale for p. In this paper the author in- 
dicates that in his experience Barnard’s 
method has not been satisfactory and he 
presents a method using “harmonic prob-
ability paper.” Harmonic probability paper
is described and an illustration of its use
is given. The author asserts that more 
satisfactory results are obtained using his 
method but gives no mathematical ex-
planation. Leo Lynch, Virginia Poly-
technic Institute.


Ottestad, Per, “On the analysis of variance 
of percentage fractions,” Skandinavisk 
Aktuarietidskrift, 35 (1952), 152-59. 

This paper discusses weighting methods 
for the analysis of variance of percentages 
based on possibly unequal numbers of 
trials. Random variation of the binomial 
parameter and of the numbers of trials are 
taken into account. After some general 
comments on the meanings and limitations 
of the process, the author develops a re- 
gression method for obtaining the weights. 
Particular emphasis is given to the case in 
which number of trials is a random variable 
independent of the binomial parameter but 
dependent on classification. In this case he 
suggests that the weights should be found 
by estimating the (linear) regression of 
class variance for percentage as a function 
of the expected value of the reciprocal of 
sample size; the weights are then taken as 
the reciprocals of the appropriate values of 
the estimated regression function. An ex- 
ample taken from Cochran’s article on the 
same topic in the Journal of the American 
Statistical Association 38 (1943), pp. 287- 
301, is worked out in detail. Finally the 
method is extended to cover the case of 
multiple classifications. Seymour Sudman,
University of Chicago.


Peto, S., “A dose response equation for the 
invasion of micro-organisms,” Biometrics, 
9 (1953), 320-35. 

Where a dose can be represented as an 
integral number of units (say invading 
micro-organisms) and response is a yes-or-
no event such as death, a one parameter
model can be offered, viz. probability of
survival equals $e^{-np}$ where $p$ is the unknown
parameter. This can be offered as an ap-
proximation to $(1-p)^n$ where $p$ is small and
$(1-p)^n$ can be regarded as the probability
of surviving the (independent) onslaughts 
of n organisms each having probability p of 
killing the host. Estimation by maximum 
likelihood is illustrated and tables to 
facilitate the (iterative) solution are given. 
The model is compared with the probit 
model and it is concluded that it is a suit- 
able alternative for many problems. Effi- 
cient choice of dosage is shown to mean 
concentrating the doses in the 10%-35%
survivors range. Lincoln Moses, Stanford
University.
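
A minimal numerical sketch of the one-parameter model may help. The Python fragment below compares $e^{-np}$ with $(1-p)^n$ and estimates $p$ by maximum likelihood. It solves the likelihood equation by bisection, which is a substitute chosen here for simplicity and is not the tabular iterative procedure of the paper. The data layout (dose, hosts exposed, survivors) and all names are assumptions.

```python
import math


def survival_probability(n, p):
    """Model probability that a host survives a dose of n organisms."""
    return math.exp(-n * p)


def score(p, data):
    """Derivative of the log-likelihood with respect to p.

    `data` is a list of (dose n, hosts exposed, survivors) triples.
    """
    total = 0.0
    for n, exposed, survived in data:
        q = math.exp(-n * p)
        one_minus_q = -math.expm1(-n * p)
        total += n * (-survived + (exposed - survived) * q / one_minus_q)
    return total


def estimate_p(data, lower=1e-12, upper=10.0, iterations=200):
    """Bisection on the score, which decreases as p increases."""
    if all(s == e for _, e, s in data) or all(s == 0 for _, e, s in data):
        raise ValueError("need at least one death and one survivor")
    for _ in range(iterations):
        middle = 0.5 * (lower + upper)
        if score(middle, data) > 0.0:
            lower = middle
        else:
            upper = middle
    return 0.5 * (lower + upper)


# Hypothetical data: one dose of 10 organisms, 100 hosts, 50 survivors.
p_hat = estimate_p([(10, 100, 50)])
```

With a single dose the estimate reduces to $-\log(\text{survivors}/\text{exposed})/n$, which gives a convenient check on the iteration.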


Reid, A. T., “On stochastic processes in 
biology,” Biometrics, 9 (1953), 275-89. 

There are many diverse fields of enquiry 
in the domain of biology where processes or 
mechanisms can best be considered in 
probabilistic terms. Examples arise in 
genetics, epidemiology, spread of rumors, 
organization and function of the nervous 
system, migration of organisms, population 
growth, experimental carcinogenesis. This 
paper considers illustrative problems which 
have been dealt with as stochastic processes. 
Unsolved problems are pointed out. The 
bibliography covers a large amount of re-
cent work in this field. Lincoln Moses,
Stanford University.


Rosenbaum, S., “Tables for a Nonpara- 
metric Test of Location,” Annals of Mathe- 
matical Statistics, 25 (1954), 146-50. 

To test whether two samples of n points 
and m points come from the same popula- 
tion, one counts the number of points, s, in 
one sample which lie outside an extreme 
value of the other sample. 

Tables are constructed to show the prob-
ability is less than 1% (or 5%) that s or
more points of a sample of size m ($\le 50$) lie
outside an end point of a sample of size n
($\le 50$), provided the samples have been
drawn randomly from the same population
irrespective of its distribution. The author
used some formulas given earlier by S. S.
Wilks in a paper on tolerance limits. A. E.
Sarhan, University of North Carolina.
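
The kind of probability tabulated here can be illustrated by a short calculation. Assuming a continuous population (no ties) and counting points beyond one chosen end point, say the maximum, the event that at least s of the m points exceed the largest of the n points is the event that the s largest of the m+n pooled points all come from the first sample. The Python sketch below uses this exchangeability argument; it is not taken from the paper, and the function names are hypothetical.

```python
from math import comb


def prob_at_least(s, m, n):
    """P(at least s of m points exceed the largest of n points) under the null."""
    if s > m:
        return 0.0
    return comb(m, s) / comb(m + n, s)


def critical_count(m, n, level):
    """Smallest s whose null probability is below the given level, if any."""
    for s in range(1, m + 1):
        if prob_at_least(s, m, n) < level:
            return s
    return None


# Hypothetical example with two samples of ten points each.
five_percent_count = critical_count(10, 10, 0.05)
one_percent_count = critical_count(10, 10, 0.01)
```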


Roy, P. M., “A note on the unreduced 
balanced incomplete block designs,” Sank- 
hya, 13 (1953), 11-16. 

It is stated that an arrangement of $vr$
units of $v$ varieties, $r$ units of each variety,
in $b$ blocks of size $k$ ($k<v$) is known as a
“Balanced Incomplete Block Design” if (i) a
variety appears only once in a block and if
(ii) pairs of varieties each appear in $\lambda$
blocks. It is necessary that $bk=vr$,
$\lambda(v-1)=r(k-1)$, and $b \ge v$. A reduced form occurs
when $b$, $r$, $\lambda$ have no common factor. The
object of the paper was to investigate (i) 
what are the unreduced designs, (ii) their 
connection with finite geometries, (iii) their 
connection with theorems of the method of 
differences, and (iv) whether they are 
capable of presentation in resolvable and 
affine forms suitable for the recovery of 
inter-block information. The only possibly
resolvable forms are shown to be those de-
signs where $v=2(t+1)$, $b=(2t+1)(t+1)$,
$r=2t+1$, $k=2$, $\lambda=1$. R. A. Bradley, Vir-
ginia Polytechnic Institute.
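
The necessary conditions quoted in this abstract are easy to verify for a small case. The Python sketch below assumes that an unreduced design is the one formed by taking all combinations of the $v$ varieties $k$ at a time, builds it, and checks the parameters against $bk=vr$, $\lambda(v-1)=r(k-1)$ and $b \ge v$. The function names are illustrative and do not come from the paper.

```python
from itertools import combinations


def unreduced_design(v, k):
    """All combinations of v varieties taken k at a time (assumed meaning)."""
    return list(combinations(range(v), k))


def design_parameters(blocks, v):
    """Return (b, r, k, lam), or raise ValueError if the design is not balanced."""
    b = len(blocks)
    k = len(blocks[0])
    replications = {sum(1 for blk in blocks if i in blk) for i in range(v)}
    pair_counts = {pair: 0 for pair in combinations(range(v), 2)}
    for blk in blocks:
        for pair in combinations(blk, 2):
            pair_counts[pair] += 1
    lambdas = set(pair_counts.values())
    if len(replications) != 1 or len(lambdas) != 1:
        raise ValueError("design is not balanced")
    return b, replications.pop(), k, lambdas.pop()


def satisfies_necessary_conditions(v, b, r, k, lam):
    return b * k == v * r and lam * (v - 1) == r * (k - 1) and b >= v


# Case t = 2 of the family v = 2(t+1), k = 2 quoted in the abstract.
t = 2
v = 2 * (t + 1)
b, r, k, lam = design_parameters(unreduced_design(v, 2), v)
conditions_hold = satisfies_necessary_conditions(v, b, r, k, lam)
```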


Sargan, J. D., “An approximate treatment 
of the properties of the correlogram and 
periodogram,” Journal of the Royal Statisti- 
cal Society, Series B, 15 (1953), 140-52. 

It is shown that the correlogram of an 
auto-regressive series can be regarded as 
though it were derived from an auto- 
regressive equation of double the order of 
the equation generating the original series. 
Some properties of the periodograms of 
time series generated by linear stochastic 
equations are investigated, and the results 
are applied to the study of Beveridge’s 
wheat price index. G. L. Edgett, Virginia
Polytechnic Institute.


Sundrum, R. M., “The Power of Wilcoxon’s 
2-Sample Test,” Journal of the Royal Sta-
tistical Society, Series B, 15 (1953), 246-54.


Mann and Whitney’s form of Wilcoxon’s 
2-sample test is described, and the variance 
of the statistic U under the null hypothesis 
is given. The upper bound of the variance of 
U is derived, and a population is con- 
structed in which this upper bound is at- 
tained. Using the assumption that the 
statistic U is normally distributed under 
both the null and the one sided alternative, 
the power of the test is found for both 
normally distributed and uniformly dis- 
tributed variates. The power of the test 
under the normal case is compared with the 
power of Student’s $t$, and the power under
the rectangular alternative is compared 
with the power of a test based on differ- 
ences of mid-ranges and a test based on 
differences of sample means. H. C. Sweeny,
Virginia Polytechnic Institute.





BOOK REVIEWS 


Design for Decision. Irwin D. J. Bross. New York: The Macmillan Co., 1953. 
Pp. viii, 276. $4.25. 


D. V. Lindley, University of Cambridge


“How do YOU make decisions? Are you unwittingly letting emotion con-
ceal essential facts? Are you sure of making the right interpretations
of facts and figures?” “How to make decisions that PAY. You’ll learn new, 
more effective techniques for reaching the best decisions on questions of all 
kinds in—Design for Decision.” “The methods pay. Today a large University 
is paying a Ph.D. on its research staff more than its football coach—because 
of his decision-making skill.” “Be the man in demand—get a copy of... .” 
And so on, and so on. The quotations are taken from the book-jacket of, or 
advertisements for, the book under review. It is a pity that in making a 
serious attempt to write popular science Bross should have been so ill-served 
by the blurb writers, and one can only query whether they weren’t paid more 
than Bross for producing this rubbish. 

In fact Bross has tried to give an account, as far as possible in non- 
mathematical language, of some of the ideas of modern statistical decision 
theory and statistical methods. Presumably the book is intended for the in- 
telligent layman who wishes to have some idea of what these much-maligned 
statisticians do, and if so it succeeds reasonably well in giving a general 
impression, though on points of detail it falls very wide of the mark. The 
style is lively and, as far as I can judge, pure American. The blurb is wrong 
again in describing it as plain English. I found it most enjoyable. “[An] out- 
line of history, from Ooze to Oak Ridge,” “Statistical Inference. How to be 
a Great Detective in one easy lesson” are two delightful examples. Despite 
this style the author never makes the mistakes of the blurb writer. 

The book falls naturally into two parts. The first begins with a brief ac- 
count of the central role decision-making plays in man’s existence, discusses 
the nature of probability, and the need for a value system in making de- 
cisions, and concludes with suggested rules for making decisions and some 
examples of their use. The second part contains an account of some modern 
statistical ideas, including a detailed account of the basic ideas involved in 
testing hypotheses and a briefer mention of other statistical tools. A supple-
ment gives suggestions for further reading. Probability to Bross is all embrac-
ing; that is, we can speak of probabilities of hypotheses and use Bayes’s theo-
rem. (Minimax, as understood by him, only includes maximization over a re-
stricted class of a priori distributions, namely the reasonable ones.) Value
systems are essential and pragmatism is the philosophy of life. When de- 
cision rules are introduced they are not presented in the usual way with, on 
the one hand, the “states of the world” and, on the other, the possible de- 
cisions, in the way they usually occur in statistical problems, but with the 
decisions contrasted with the possible outcomes. This would make the appli-
cation of the rules to statistical inference difficult if it were not for the fact
that Bross never seems to realize that statistical decision theory has anything
to do with, for example, testing the mean of a normal population when the
standard deviation is known! In fact the second half of the book is inde- 
pendent of the first half, apart from the discussion on probability. This is 
quite the most astonishing thing about the book. Whilst on p. 83 we are told 
that “the methods proposed by R. A. Fisher to replace Bayes rule also make 
assumptions about the [prior probabilities] ...and the substitutes for 
Bayes rule actually represent special cases of the rule,” we are later (Chap- 
ter 13) presented with a Fisherian argument for the test of normal mean 
based on significance levels without a mention of Bayes’s theorem! Similar 
remarks apply to the value system, except for some observations on a Simple 
Value System (p. 101) which are not explored. It is amazing that he should 
not realize that a logical pursuit of the decision function method (either with 
or without prior probabilities) leads to a rejection of the significance level 
approach: in general if weights are assigned without consideration of the 
sample size, as they usually would be, then the significance level (i.e. the 
probability of error) will vary with this size. Consequently the second half of 
the book is not merely independent of, but is in contradiction with, the first 
part. There is even more to it than that, for some of the paradoxes of stand- 
ard statistical methods are quoted without it being realized that they are 
resolved by Wald’s ideas. Thus (p. 209) “One crusader against the myth of 
normality has a standard offer of $100.00 for any collection of data with over 
one thousand observations which will meet the standard statistical tests of 
normality. So far as I know he has had no takers.” Decision theory—balancing 
one hypothesis against another—resolves this criticism of statistical tests. 

Nevertheless the second part of the book does contain a lively, reasonably 
accurate, and always interesting account of some statistical ideas. This is 
possible because, so far as one can see, practicing statisticians, that is people 
who handle data, not the mathematical manipulators of uninterpreted 
symbols, do not use decision theory ideas. Significance levels are still used 
despite Wald. To discuss why this is so would take us too far away from the 
book under review. Any statistician wishing to recommend this book to his 
friends should therefore warn them to take a good pinch of salt with the first 
part. Even the second half is not free from blemish for there is one grand 
howler. The test chosen by Bross as an example is that for the mean of a nor- 
mal population, standard deviation known. The description is lucid and it is 
a great pity that it is spoiled by the use of the sum of squares as the test cri-
terion and reference to the $\chi^2$-table for its significance! (p. 229). This appears
to have arisen through a misunderstanding of the meaning of the term 
“sufficient statistic.” 

There are many other things one might comment on, for this is a stimu- 
lating book. Several remarks are sheer nonsense: “Our standard for bias is 
based on the rule that whenever an overwhelming majority of observers are
in agreement they are, ipso facto, right” (p. 152). Others are very sensible:
those on the importance of a value system, for example. Other remarks are 
contradictory: “if this outcome or result of the decision process is agreeable 
to me, then the decision may be adjudged satisfactory” (p. 20), but “...a 
definite and objective way to progress from data to inference . . . all arrive at 
the same conclusion provided they start from the same data” (p. 221). (In 
both cases my italics.) What does, however, annoy me is Bross’s assurance. 
He states everything as though there were no doubt at all. Pragmatism is 
right. Probabilities of hypotheses are all right. He never gives a hint that 
other people might hold different views and yet be sound. The world is es- 
sentially simple to him: his blacks are jet, his whites are bright and there are 
no greys. I envy him. But “I beseech you, in the bowels of Christ, think it 
possible, you may be mistaken.” 

Here then is a book which gives overall a good account of statistical ideas 
for the layman, a less satisfactory account of decision theory, but which is 
always entertaining. It contains a magnificent misprint. In connection with 
the Renaissance, the renewed interest in Greek ideas, and the dawn of the 
experimental method, we read: “The same era that witnessed the rediscovery 
of Reason also saw the birth of the successor to Reason—Silence.” 


The Design and Analysis of Experiment. M. H. Quenouille. New York: Hafner 
Publishing Co., 1953. Pp. xiii, 356. $7.50. 


Bernard Ostle, Montana State College


The appearance of four books on experimental design in the last few years
is evidence that authors have become aware of a gap which has existed in
the literature for too long. The four books are: Experimental Designs by
Cochran and Cox, Design and Analysis of Experiments by Kempthorne,
Analysis and Design of Experiments by Mann, and the volume reviewed here. 
The four authors had different goals. Mann’s text is a purely mathematical 
approach to general linear hypothesis theory. Cochran and Cox give a catalog 
of useful designs, accompanied by considerable explanatory material which 
makes their book most helpful to research workers. Kempthorne’s work is 
conceived on a larger scale: he successfully attempts a logical development 
of designs (mainly factorials) and design principles, and thus has produced 
a book of an advanced nature more suitable for graduate study than for re- 
search use. Quenouille’s book is most nearly akin to that by Cochran and 
Cox, but gives greater emphasis to groups of experiments and long-term 
policy. 

The Design and Analysis of Experiment has been divided into four sections 
which, with the topics included in each, are: (A) Elementary Principles and 
Designs: (1) The design and analysis of experiments, (2) Randomised blocks 
and Latin squares, (3) Factorial and split-plot designs; (B) Incomplete 
Block Designs: (1) Factorial designs involving factors at two or three levels,
(2) Complex factorial designs, (3) Incomplete block designs for a single set
of treatments; (C) Long-Term Policy: (1) Long-term experiments, (2) 
Planning of groups of experiments, (3) Combinations of experimental re- 
sults; (D) Experimental Complications: (1) Special designs and analyses, 
(2) Missing observations, (3) Scaling of observations. 

In the preface, the author states: “This book is aimed at those wishing to 
acquire a working knowledge of experimental design and an understanding 
of the principles governing it.” Unfortunately, the explanations are often too 
brief and too technical to be of great value to research workers not well- 
versed in statistical theory and principles. In particular, this is not a book for 
persons lacking a solid background in analysis of variance. The constant 
stress on the assumptions necessary for a valid use of the various designs is, 
however, an excellent feature. Too often this important part of experimental 
planning is omitted from classroom lectures and text book chapters. 

In section B, Quenouille describes the main types of confounding. The use 
of partial confounding is discouraged because of complexities of computation. 
While it is true that partial confounding does add to the time involved in 
calculation, this extra effort is frequently worthwhile. This apparent con-
demnation of partial confounding is characteristic of Quenouille’s tendency
to use sweeping statements which may lead the non-critical reader to
lose the benefits of certain advanced techniques.

The most interesting material in this book is contained in Sections C and 
D, especially under the heading Long-Term Policy. The discussion on plan- 
ning of groups of experiments and combining experimental results is excellent. 
On the other hand, the reviewer wonders why “missing observations” were 
singled out for special attention. This subject could easily have been handled 
as a necessary consideration when discussing each particular design. Also,
rotation experiments surely deserve more attention than the four pages de- 
voted to them. 

In summary, much useful information is contained in the book. However, 
the tendency to be brief and technical, accompanied by an occasional careless- 
ness in writing (due, no doubt, to attempting to complete the work within 
too short a period of time), has detracted from the value of the book. It is 
not to be read quickly. Nevertheless, experienced research workers in the 
agricultural and biological sciences will find it a good reference text if read 
critically. 


Sample Survey Methods and Theory. Vol. I. Methods and Applications; 
Vol. II. Theory. Morris H. Hansen, William N. Hurwitz and William G. Madow. 
New York: John Wiley & Sons, Inc., 1953. Pp. xxii, 638; xiii, 332. $8.00; $7.00. 


Tore Dalenius, Stockholm


Background. The campaigns conducted during the first quarter of this
century for an increased use of “the representative method,” supported
by the International Statistical Institute among others, showed special suc-
cess in the latter part of the 1930’s. The lead in the new era of sampling was
taken by India and the United Kingdom, where the development naturally 
was knitted to improvements in the field of agricultural statistics, and by the 
United States, where a great portion of the development was knitted to im- 
provements of methods for measuring socio-economic phenomena. 

The developments of 1939 accelerated this trend. In the U. S., the Bureau
of the Census took the lead. The Census Bureau introduced sampling meth- 
ods into the 1940 census to an extent not previously seen, and transferred 
many sample surveys carried out by means of non-probability methods to a 
probability basis. This development was responsible for the creation of a 
large and competent “sampling staff” within the Census Bureau. Among the 
members of this staff were Morris H. Hansen, William N. Hurwitz and Wil- 
liam G. Madow, authors of the two volume work Sample Survey Methods and 
Theory, published as one of the Wiley Publications in Statistics. 

A large amount of the work carried out by Hansen-Hurwitz-Madow and 
their colleagues necessarily meant application of already available theory to 
actual survey operations. But a considerable portion of the activity was 
devoted to the development of new theory and new methods. 

Portions of the results thus achieved have been presented earlier; examples 
are the book A Chapter in Population Sampling, the almost classical 1943 
paper entitled “On the Theory of Sampling from Finite Populations” and 
the 1949 paper “On the Determination of Optimum Probabilities in Sam- 
pling.” 

Objects of the book. Sample Survey Methods and Theory represents “an at- 
tempt to give a comprehensive presentation of both sampling theory and 
practice.” The book as a whole, as well as each one of the two separate vol- 
umes, is designed as a textbook; it should, moreover, serve as a manual for 
the investigator engaged in the design of sample surveys. Finally, parts of 
the book are intended for “the user of the results of surveys who wishes to 
know the circumstances under which he may place confidence in information 
based on samples.” 

Broad summary of content. As indicated by the subtitles of the two volumes,
Volume I is devoted to applications (it is labelled, by Wiley, “applied sta- 
tistics”); Volume II is devoted to theory (labelled, by Wiley, “mathematical 
statistics”). 

Volume I may be looked upon as made up of three parts; the introduction
and chapters 1-3 make up the first part, chapters 4-11 the second part, and 
chapter 12 the third part. 

The introduction and chapters 1-3 address themselves to the consumer of 
survey results rather than to the producer. The language is “nonmathema- 
tical”; this does not mean a complete lack of formulas and symbols but a 
frequent use of easily-grasped illustrations. In addition to presenting the 
usual sampling principles, the first part presents the philosophy of the use of 
measurable sample survey designs.

Chapters 4-11 present a detailed account of methods for sampling from
finite populations. Most results are given with references to proofs in Volume 
II. The following list of chapter headings gives an idea of the contents: 4, 
Simple random sampling; 5, Stratified random sampling; 6, Simple one- or 
two-stage cluster sampling; 7, Stratified single- or multi-stage cluster sam- 
pling; 8, Control of variation in size of cluster in estimating totals, averages, 
or ratios; 9, Multi-stage sampling with large primary sampling units; 10, 
Estimating variances; 11, Regression estimates, double sampling, sampling 
for time series, systematic sampling, and other sampling methods. 

Chapter 12, “Case Studies,” constitutes the third part. In the first half of 
this 110 page chapter, three sample surveys from the practice of the Census 
Bureau are presented in detail in a way which demonstrates how the many 
methods presented in chapters 4-11 are integrated into a survey design. The 
rest of chapter 12 deals with two studies of variances and co-variances and 
the use of quality control methods in the office processing of the 1950 
Censuses of Population, Housing, and Agriculture. 

Likewise, Volume II may be looked upon as made up of three parts; 
chapters 1-3 make up the first part, chapters 4-11 the second part and chap- 
ter 12 the third part. In chapter 1, the fundamental definitions used in Vol- 
ume I are summarized; by this device, used throughout Volume II, this 
volume is made self-contained; it is possible to read Volume II before Volume 
I, or even to read only Volume II. Chapters 2-3 present the fundamental 
theorems on probability, expected values and variances necessary for master- 
ing the proofs given in the following chapters. 

Chapters 4-11 are mainly devoted to proofs of the results quoted in Vol- 
ume I. The presentation runs, chapter by chapter, parallel in the two vol- 
umes. However, in addition to the empirical results presented in Volume I, 
there are new ones scattered in these chapters. 

Chapter 12, finally, presents a theory of response errors; the chapter is a 
revision of a paper published in this Journal in 1951. 

An effort to evaluate the book. I am of the opinion that a review, in order to 
be comprehensive, should end up with an effort to evaluate the book against 
the background of the objectives set forth by the authors. 

The book is a “comprehensive presentation of both sampling theory and 
practice.” As to theory (for sampling from finite populations), I am at a loss 
to find anything of importance that has been left out; as to practice, it is 
true that most applications are selected from the work within the Census 
Bureau but there seems to be no important type of finite population en- 
tirely left out. 

I have not had experience with this book as a textbook; I have had, how- 
ever, the opportunity of attending a course, in Washington in 1951, which 
was based on notes having a story in common with this book. From this ex- 
perience, I conclude that the book will be found to be an excellent textbook. 

There is a specific feature of the book which deserves to be mentioned. By 
splitting the book into two volumes, one “applied” and one “theoretical,” 
the authors have been able to present the solution of a design problem in 
different ways in the two volumes. In Volume I, the authors stress just as 
much the region (interval or whatever it may be) within which the exact 
mathematical solution is to be found, as the exact solution itself; thus, the 
authors stress that “the optimum is broad” when discussing, for example, 
optimum allocation in stratified sampling and optimum size of cluster. 
Solutions of this kind are sorely needed in actual work where one almost 
always has to design without exact information as to the size of important 
“design parameters” (such as variances). In volume II, on the other hand, 
emphasis is on the exact solutions. Volume II thus teaches the technique to 
use on one’s own problems. 
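
The remark that “the optimum is broad” can be illustrated numerically. The following is a minimal sketch, not taken from the book: the two-stratum population, its weights and standard deviations, and the function names are all assumptions chosen for illustration, and finite population corrections and unequal unit costs are ignored.

```python
def stratified_variance(weights, sds, sizes):
    """Variance of the stratified mean, ignoring finite population corrections."""
    return sum(w * w * s * s / n for w, s, n in zip(weights, sds, sizes))


def neyman_allocation(weights, sds, total_n):
    """Allocate total_n in proportion to W_h * S_h (the exact optimum for equal unit costs)."""
    products = [w * s for w, s in zip(weights, sds)]
    scale = total_n / sum(products)
    return [p * scale for p in products]


# Hypothetical two-stratum population: equal weights, standard deviations 1 and 2.
weights, sds, total_n = [0.5, 0.5], [1.0, 2.0], 100

optimum = neyman_allocation(weights, sds, total_n)   # about [33.3, 66.7]
v_opt = stratified_variance(weights, sds, optimum)
v_off = stratified_variance(weights, sds, [40, 60])  # a deliberately "wrong" allocation

print(optimum, v_opt, v_off / v_opt)
```

In this made-up case, moving the first stratum from about 33 to 40 units raises the variance by less than two per cent, which is the sense in which a designer without exact knowledge of the variances loses little.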

Space does not permit a detailed discussion of other valuable aspects of 
this book. I only want to mention that I welcome especially the thorough 
analysis of survey costs and the construction of cost functions. There are, 
in official survey reports from all over the world, many examples of cost 
functions; but these examples are often difficult to interpret and use as long 
as there are only, at the very best, indications of what kind of costs are taken 
care of by the different components in the cost functions. 

In summary, this is a great book, which will be indispensable to every per- 
son, statistician or not, who comes close to sample surveys. Of course, in a 
book of nearly 1,000 pages, one can find opportunities to criticize some 
points. Most of them are too minor to be dealt with at length (e.g., the def- 
inition of simple random sampling, chapter 4, does not seem to fit the one 
in chapter 5; I would prefer to see a distinction between a “ratio estimate” 
and an “estimate of a ratio,” and so on). But it is perhaps justifiable to say 
that chapter 11 is somewhat displaced and too “mixed.” Personally, I would 
rather see one separate chapter devoted to systematic sampling, possibly 
placed before the present chapter 6. The difference and regression estimates 
could be discussed in exactly the same way as are the ratio estimates (i.e., 
in conjunction with the specific sampling systems, as means of improving pre- 
cision over and above that of simple estimates such as the sample mean). 
[Note: Since chapter numbers and content are parallel in the two volumes, 
the foregoing references to chapters are for both volumes.] 

The language in the book is, even for a foreigner, easy; one soon gets used 
to words like “rel-variance,” “epsem,” etc. But the symbols are part of the 
language. The lack of a standard in this area is in itself regrettable and should, 
I think, be taken care of. As a result of this lack, Wiley has published in 
1953 two excellent books in sampling, this and Cochran’s, which use rather 
different symbols. 

I hate to finish my review of this excellent book by being critical, so I 
repeat: Sample Survey Methods and Theory is a great book, which will be 
indispensable to every person, statistician or not, who comes close to sample 
surveys. 


Elementary Statistical Analysis. Harry P. Hartkemeier. Dubuque, Iowa: Wm. 
C. Brown Co., 1952. Pp. xxii, 484. $6.00. Paper. 


Acheson J. Duncan, The Johns Hopkins University 


This book by Professor Hartkemeier is the text that is used in the fresh- 
man course in elementary statistics at the University of Missouri at which 
he is Director of the Statistics Laboratory. It assumes no prerequisite of 
college mathematics and is slanted toward the student of business. In the 
author’s own words, it “has been written for the person who likes to have 
directions for the immediate practical application of the elementary statis- 
tical techniques to sample data without waiting until all statistical techniques 
and the mathematical theory underlying them have been explained in de- 
tail.” 

In his twenty odd years of teaching statistics Professor Hartkemeier has 
developed many novel ideas as to how a beginning course in the subject 
should be taught and has as a consequence written a book that is strikingly 
different from other statistics texts. This is true with respect to both format 
and contents. The pages are full size reproductions of typewritten copy, the 
cover is heavy paper and the whole is fastened together with plastic screws 
that allow withdrawal of pages at will. This loose leaf character of the text 
permits the author to include special problem forms and work sheets that 
upon completion may be extracted by the student for submission to the 
instructor. It also permits extraction of tables at time of examination without 
running the danger of recourse to forbidden sections of the text. Illustrations, 
tables, and problem forms are placed at the ends of the chapters thereby 
permitting ready reference. 

The book is concerned primarily with tests of significance for small sam- 
ples. After an “introductory” chapter in which the reader is given the ele- 
ments of frequency distributions, time series analysis, index numbers and 
correlation (all in 35 pages), the author settles down to a detailed discussion 
of the computation of square roots and use of mathematical tables, the com- 
putation and use of the arithmetic mean and standard deviation and other 
types of averages and representative values, tests of significance for means 
and standard deviations and comparisons of two means and standard devia- 
tions, contingency tables and chi-square tests, and a discussion of analysis of 
variance that includes problems related to both single and two way classi- 
fications, unequal numbers in different cells, Latin squares, and data with 
several sources of random error. There is also a chapter on statistical quality 
control in manufacturing operations, and a chapter on computing procedures 
and machines, drawn heavily from the author’s book Punch-Card Methods 
and illustrated copiously with pictures (supplemented with directions for 
use) of many of the modern computing machines, including even the Monroe 
Octal Adding-Calculator. All this material is presented in a very readable 
style that the reviewer believes beginning students will like. At the end of 
each chapter there are numerous problems and considerable problem data. 

In writing a beginning book that attempts to explain the principles of 
statistical theory without the use of mathematics, the greatest difficulty is 
to prevent the argument from becoming inexact and perhaps incorrect. 
Although Professor Hartkemeier does a splendid job on the whole of explain- 
ing difficult material, he has not succeeded entirely in avoiding misleading 
statements and errors. The principal instances of this kind noted by the 
reviewer are as follows: 

(1) Charts of frequency curves in the book have a vertical scale marked 
“number of” or “relative number of” cases, whereas in fact it is the area 
under the curve that is a measure of the “relative number of” cases and not 
the ordinate of the curve. 

(2) The text suggests that the mean has little use in a highly skewed dis- 
tribution. There are cases, however, in which it may be the “best” average 
for certain purposes. If we know the mean family income, for example, and 
the number of families in a given community, we can compute the com- 
munity income. We cannot do this with the mode or the median. 
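
The point about the mean can be checked with a few lines of arithmetic. This is only an illustrative sketch; the five family incomes are invented for the purpose.

```python
from statistics import mean, median

# Hypothetical family incomes in a small, highly skewed community.
incomes = [2000, 3000, 3000, 4000, 28000]
n_families = len(incomes)

total_from_mean = mean(incomes) * n_families      # recovers the true total
total_from_median = median(incomes) * n_families  # does not

print(sum(incomes), total_from_mean, total_from_median)
```

Only the mean, multiplied by the number of families, returns the community total; the median understates it badly here because of the one large income.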

(3) The text tends to give a false conception regarding the character and 
use of the t-distribution. The impression is given that the ratio of a variable 
to its standard error follows the t-distribution if the sample from which the 
variable is calculated is small. Thus, on p. 372, the ratio of a mean of a 
sample of 25 to a known standard error is treated as a t variable. Actually it 
is a normal variable. Again, on p. 373, the ratio (σ sample − σ universe)/known 
standard error of σ is treated as if it had the t-distribution (apparently be- 
cause the sample size is 25) whereas it actually is distributed as χ or more 
precisely as √2(χ − √N). 

The facts about the t-distribution are as follows: If z is a normally distrib- 
uted variable with zero mean and unit standard deviation, if u is a variable 
that follows the χ² distribution with n degrees of freedom, and if z and u 
are independently distributed, then the ratio z/√(u/n) has the t distribution 
with n degrees of freedom. The t distribution approaches the normal distri- 
bution as the degrees of freedom become infinitely large. 
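
The definition just stated can be tried out by simulation. The sketch below is an illustration only: the seed, the number of draws, and the helper names are assumptions, and the chi-square variable is built as a sum of n squared standard normal variables.

```python
import random
from math import sqrt


def t_variate(n, rng):
    """Draw z / sqrt(u / n), with z standard normal and u chi-square on n degrees of freedom."""
    z = rng.gauss(0.0, 1.0)
    u = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))  # chi-square as a sum of n squared normals
    return z / sqrt(u / n)


def sample_variance(xs):
    """Unbiased sample variance."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


rng = random.Random(1954)  # fixed seed so the run is repeatable
draws = 20_000

var_small = sample_variance([t_variate(10, rng) for _ in range(draws)])
var_large = sample_variance([t_variate(30, rng) for _ in range(draws)])

# Theory: the variance of t on n degrees of freedom is n / (n - 2) for n > 2,
# which tends to 1 (the normal value) as n grows.
print(var_small, 10 / 8, var_large, 30 / 28)
```

The simulated variances should lie close to n/(n − 2) and move toward unity as the degrees of freedom increase. Note that the ratio depends on an estimated denominator; a mean divided by a known standard error has no such denominator and is simply normal.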

(4) The text fails to point out (p. 334) that the use of the ordinary χ² 
table of probabilities in χ²-tests of frequencies involves an approximation. 
This is essentially the same in character as the approximation of a binomial 
probability by an area under a normal curve. It is the reason for the χ² 
correction for continuity in a 2×2 contingency table when samples are not 
large. 
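
The effect of the continuity correction in a 2×2 table can be seen in a small computation. This is a sketch under stated assumptions: the cell counts are invented, and the function is the usual shortcut formula N(|ad − bc| − N/2)²/(row and column totals multiplied together), not anything drawn from the text under review.

```python
def chi_square_2x2(a, b, c, d, continuity=False):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]], optionally with the continuity correction."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if continuity:
        diff = max(0.0, diff - n / 2)  # shrink |ad - bc| by N/2, never below zero
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical small-sample table.
plain = chi_square_2x2(10, 5, 4, 11)
corrected = chi_square_2x2(10, 5, 4, 11, continuity=True)
print(round(plain, 3), round(corrected, 3))
```

For these counts the uncorrected value is about 4.82 and the corrected value about 3.35, so the two fall on opposite sides of the 5 per cent point for one degree of freedom (3.841). With small samples the approximation is therefore not a matter of indifference.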

(5) The title of Chapter 9, “Statistical Methods Necessary for Quality 
Control in Manufacturing Operations,” is misleading. A better one would 
have been “The Author’s Ideas on Some Quality Control Procedures.” 
Although Professor Hartkemeier refers to several books on quality control, 
including one by the reviewer, he draws little from them. Instead he pro- 
ceeds to describe methods that differ widely from the standard procedures. 
Thus, his X̄-chart uses samples of 25, not the customary samples of 4 or 5. 
The central line on the chart is apparently based on specifications, in the 
manner of a modified X̄-chart. The lower limits on the chart are 2.5% and 
0.5% probability limits incorrectly based on the t-distribution, not the ordi- 
nary 3σ limit. The upper limits are statistical limits but are confusedly inter- 
preted as being related to some specification based on the desire not to pro- 
duce too good a product. The discussion emphasizes the use of the X̄-chart 
to maintain a constant allowable fraction defective, but says little about the 
use of the chart in discovering assignable causes (the principal use of the 
control chart). 
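
For comparison, the ordinary three-sigma limits for a chart of sample means are easily computed. The sketch assumes a known process standard deviation and a given central line; both figures and the function name are hypothetical, and in practice the limits are usually estimated from the data.

```python
from math import sqrt


def xbar_limits(center, sigma, n):
    """Return (lower, upper) three-sigma limits for means of samples of size n."""
    half_width = 3 * sigma / sqrt(n)
    return center - half_width, center + half_width


# Hypothetical process: central line 50.0, process standard deviation 2.0.
print(xbar_limits(50.0, 2.0, 4))   # a customary small sample
print(xbar_limits(50.0, 2.0, 25))  # the sample size used in the text under review
```

The limits narrow with the square root of the sample size, so samples of 25 give limits two and a half times as tight as samples of 4.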

The chart that is suggested for controlling variability is the little-used 
standard deviation chart. Again it is related to the attempt to maintain a 
constant fraction defective rather than to the attempt to detect assignable 
causes. Strangely (and incorrectly), limits are again set by a t factor. No 
mention is made of the range chart, although this is used in industry many 
more times than the standard deviation chart. The statement made earlier 
in the text (pp. 117, 212) that the range is not used much by statisticians is 
very much in error in the industrial field. 

In any course in which this book is used, the reviewer would strongly urge 
the omission of this chapter on statistical quality control. Fortunately, such 
an omission can be made without difficulty. 

(6) The text does not state the basic assumptions of analysis of variance 
(additivity of main effects and normality, independence, and uniform vari- 
ability of the random variable). This omission is somewhat unfortunate in 
that the text is free and easy in its illustrations of the uses of analysis of vari- 
ance. It is applied to percentage data (pp. 441, 451) where the assumption of 
uniform variability is probably violated; and it is applied to sales data (p. 
443) where the assumption of additivity of main effects is questionable. It 
may be admitted that the F-test is “robust” and may not be seriously af- 
fected by such deviations from strictly valid procedures. The use of such illus- 
trations, however, coupled with failure to state the assumptions underlying 
strictly valid procedures may lead the reader to believe that there are no 
limitations to the application of analysis of variance techniques. 

More serious than these deviations from the basic assumptions of analysis 
of variance is the tendency to play around with the data until some signifi- 
cant conclusion is reached with little regard to the effect of this procedure 
upon the final level of significance. Thus, t-tests are run after an F-test shows 
non-significance (p. 398), and data are reclassified (p. 399) and tested again 
after a first F-test shows non-significance. 
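
A rough idea of what repeated testing does to the level of significance can be had from the following sketch. It assumes the tests are independent, which the successive tests on one set of data described above are not, so the figures indicate only the direction and approximate size of the effect.

```python
def familywise_error(alpha, k):
    """Chance of at least one false rejection in k independent tests, each at level alpha."""
    return 1 - (1 - alpha) ** k


for k in (1, 2, 5, 10):
    print(k, round(familywise_error(0.05, k), 4))
```

Under this assumption, five tests at the 5 per cent level carry a chance of about 23 per cent that at least one will appear significant when no real effect exists.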

The use of charts to show interactions is an excellent device, but to inter- 
pret an interaction as the “crossing” of the movements at various levels 
(p. 409) is not accurate. The criterion of interaction is non-similarity of 
movement, not crossing of the interaction graphs. 
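
The distinction can be made concrete with a 2×2 table of cell means. The sketch below is illustrative only; the cell means are invented, and the contrast shown is the simple difference of differences, which is zero when the movements are similar.

```python
def interaction_contrast(cell_means):
    """For a 2x2 table of cell means [[m11, m12], [m21, m22]], return (m11 - m12) - (m21 - m22)."""
    (m11, m12), (m21, m22) = cell_means
    return (m11 - m12) - (m21 - m22)


parallel = [[10, 14], [20, 24]]   # same movement in both rows: no interaction
diverging = [[10, 14], [20, 34]]  # lines never cross, yet the movements differ
crossing = [[10, 24], [20, 14]]   # lines cross

for name, table in [("parallel", parallel), ("diverging", diverging), ("crossing", crossing)]:
    print(name, interaction_contrast(table))
```

The second table shows an interaction although its two lines do not cross, so crossing is sufficient but not necessary.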

The computation of a row sum of squares (p. 394) when there are actually 
not rows but merely an equal number of cases in each class is very confusing 
and should be omitted. 

These criticisms should not deter an instructor well trained in statistics 
from using the text for a beginning course for, as noted above, it will probably 
be liked by the students and this is important. A greater drawback in the 
eyes of the reviewer is the little attention given in the book to confidence 
intervals, errors of the second kind, and correlation. 


Mathematics and Statistics for Economists. Gerhard Tintner. New York: Rine- 
hart and Company, Inc., 1953. Pp. xiv, 363. $6.50. 


G. Baley Price, University of Kansas 


The standard mathematics curriculum in American colleges and universi- 
ties is one which has grown up in connection with the physical sciences— 
one which has been designed to support the study of chemistry, physics, 
and the engineering sciences. The traditional sequence of courses—college 
algebra, trigonometry, analytic geometry, calculus, and differential equa- 
tions—is badly out of date because it has undergone no fundamental revision 
in fifty years, and perhaps a hundred. 

There is abundant evidence that major changes are in progress. The 
Mathematical Association of America and the National Council of Teachers 
of Mathematics sponsored a Conference on Teacher Training at the Uni- 
versity of Wisconsin in the summer of 1952. As an outgrowth of this confer- 
ence and of the activities of the National Research Council’s Committee on 
the Regional Development of Mathematics, two committees have been ap- 
pointed to study the revision of the undergraduate curriculum. One is a joint 
committee of the MAA and the NCTM under the chairmanship of Dr. C. V. 
Newsom, and the other is a committee of the MAA under the chairmanship 
of Professor W. L. Duren, Jr. The National Science Foundation sponsored 
a Summer Conference on Collegiate Mathematics at the University of Colo- 
rado in the summer of 1953, and it will sponsor two similar conferences, at 
the University of Oregon and the University of North Carolina, in the sum- 
mer of 1954. The National Science Foundation will also sponsor a conference 
for high school teachers of mathematics in the summer of 1954. All of these 
activities are designed to modernize the mathematics curriculum and its 
teachers. 

Professor Tintner’s book on Mathematics and Statistics for Economists 
must be considered further evidence of the change that is taking place in 
mathematics, especially in its relation to the social and biological sciences. 
This book was written to teach mathematics and statistics to economists, 
especially to future econometricians. It was written for students who know 
economics, and who have some knowledge of algebra and trigonometry. 

The book is divided into three parts. Part I covers pages 3 to 65. The chap- 
ter headings in this part are: functions and graphs; linear equations in one 
unknown; systems of linear equations; quadratic equations in one unknown; 
logarithms; progressions; determinants; and linear difference equations 
with constant coefficients. It might be supposed from these chapter headings 
that Part I is a brief college algebra, but this is not the case: it is entitled 
Some Applications of Elementary Mathematics to Economics. In the first 
place, the treatment of algebra is far less extensive than the chapter titles 
would suggest. The chapter on the quadratic equation in one unknown con- 
tains a half page of discussion, in which the formula for the roots of the quad- 
ratic is stated, and a half page of exercises. The chapter on logarithms con- 
tains no treatment of the properties of logarithms and their applications to 
computation. Instead, a knowledge of logarithms is assumed, and applica- 
tions are made to two problems in economics. The chapter on determinants 
is marked as one that is not needed for the remainder of the book and can be 
omitted. On the other hand, a chapter on linear difference equations is in- 
cluded, and it is not marked as one that can be omitted. Thus the algebra 
content of Part I is far less than that of the typical course in college algebra. 
In the second place, Part I contains a large amount of economics. Indeed, 
it contains a treatment of the following topics from economics: linear pro- 
gramming; linear-supply functions; linear-demand functions; market equi- 
librium; market equilibrium for several commodities; imputation; quadratic 
demand and supply curves; Pareto distribution of income; demand curves 
with constant elasticity; growth of enterprise; population theory of Malthus; 
and compound interest. 

Part II, entitled Calculus, covers pages 69 to 190 and contains a fairly ex- 
tensive treatment of differential calculus. One chapter of 13 pages is devoted 
to a treatment of integral calculus. This chapter states the fundamental 
theorem of calculus, but it does not suggest a proof. The extent of the treat- 
ment of calculus is well indicated by the chapter headings in Part II: func- 
tions, limits, and derivatives; rules of differentiation; derivatives of logarith- 
mic and exponential functions; economic applications of the derivative; 
additional applications of derivatives; higher derivatives; maxima and mini- 
ma in one variable, inflection points; derivatives of functions of several 
variables; homogeneity; higher partial derivatives and applications; ele- 
ments of integration. It may be remarked that the treatment of calculus 
given here is far less extensive than that contained in the standard calculus 
courses normally given in the freshman and sophomore years to physical 
science students. Part II also treats the following topics in economics: de- 
mand functions and total revenue functions; total and average-cost func- 
tions; marginal cost; marginal revenue; elasticity; elasticity of demand; 
marginal revenue and elasticity of demand; increasing and decreasing mar- 
ginal costs; monopoly; average and marginal cost; marginal productivity; 
partial elasticities of demand; joint production; utility theory; production 
under free competition; marginal cost, total cost, average cost; and consum- 
er’s surplus. 

Part III, entitled Probability and Statistics, fills pages 193 to 309 and con- 
tains a brief but significant treatment of probability and statistics. The chap- 
ter headings are the following: probability; random variables; moments; 
binomial and normal distributions; elements of sampling; tests of hypotheses; 
fitting of distributions; regression and correlation; index numbers; and a 
postscript which contains suggestions for further reading. In contrast with 
the first two parts of the book, economic theory is conspicuous by its absence 
from Part III. To be sure, the fitting of demand and supply curves is treated 
on pages 292 to 297, but in general this part of the book appears to be a 
rather straightforward treatment of statistics. 

Pages 311 to 340 contain answers to the odd-numbered exercises. The next 
section of the book consists of six tables as follows: four place common 
logarithms (pp. 341-342); natural trigonometric functions (pp. 343-346); 
four place natural logarithms (pp. 347-348); area of the normal probability 
curve (p. 349); Student’s t-distribution (p. 350); and the χ² probability scale 
(p. 351). The book ends with an index of names (p. 355); an index of mathe- 
matical and statistical terms (pp. 357-360); and an index of economic terms 
(pp. 361-363). 

Many departments of mathematics now recognize a responsibility to teach 
mathematics for the social and biological sciences as well as for the physical 
sciences. Professor Tintner’s book emphasizes a number of the problems that 
face these departments in their efforts to discharge their new responsibilities. 
The typical instructor in mathematics will feel that he is not competent to 
teach Mathematics and Statistics for Economists because his knowledge of 
economics and statistics is inadequate. A major problem in the introduction 
of a new undergraduate curriculum will be the training of the present 
mathematics staffs to teach the new curriculum. The fact that Professor 
Tintner holds the unusual title of Professor of Economics, Mathematics, and 
Statistics emphasizes that he is not a typical staff member. A major problem 
concerns the organization and arrangement of a curriculum designed to serve 
the needs of all those fields which now make significant use of mathematics. 
Must departments of mathematics now offer separate courses for physicists 
and chemists, for engineers, for economists, for psychologists, for biologists, 
and so on? The small liberal arts colleges, in which so many of our scientists 
and other scholars originate, will find such an arrangement impossible be- 
cause of their limited staffs and the small number of their students. Many 
university educators and administrators will oppose such specialized courses 
on a variety of grounds. But is it possible to devise a freshman course in 
mathematics that the entire university will find adequate and acceptable? 
The question is more easily asked than answered. Finally, Professor Tintner’s 
book emphasizes that the needs of the economists will not be satisfied by a 
course in statistics. It appears significant that, as pointed out above, the 
concepts and principles of economics occur in Parts I and II rather than in 
connection with statistics in Part III. 

Whatever the ultimate solution of the problems involved in teaching 
mathematics to economists, both the mathematicians and the economists are 
indebted to Professor Tintner. He has written what appears to be a teachable 
textbook in a new field. In particular, he has provided an enormous collection 
of significant and vital exercises in economics which involve the elementary 
parts of mathematics. The extent of this collection of exercises is indicated 
by the fact that, as noted above, the answers to the odd-numbered exercises 
alone fill 30 pages of the book. One of the major problems that confronts the 
mathematicians in their efforts to revise their elementary curriculum is the 
collection of similar exercise material which will relate the old and new 
mathematics to the various fields and subjects where it finds application. 


Hood, William C., and Koopmans, Tjalling C., editors. Studies in Econometric 
Method. Cowles Commission Monograph No. 14. New York: John Wiley and 
Sons, 1953. Pp. xix, 323. $5.50. 


Kenneth J. Arrow, Stanford University 


This collection of ten studies of problems in the estimation of simultaneous 
structural equations is indispensable for the modern econometrician. It 
gives a virtually complete picture of the present state of the subject and at 
the same time is eminently readable. Of the papers, three have been pre- 
viously published, the remainder being specially written for this volume. 

The present volume follows a pattern already set in earlier collective works 
published by the Cowles Commission; there is one long paper which sets forth 
systematically the basic ideas of the subject, while the remaining papers pre- 
sent more detailed expositions or further developments. In this case, the 
central paper is Chapter VI, by Koopmans and Hood, an admirably clear 
exposition of the model of linear simultaneous stochastic difference equa- 
tions, the definition of exogenous, endogenous, and predetermined variables, 
the criteria for identification in this model, the maximum-likelihood method 
of estimation with particular reference to the limited-information case, and 
some statistical tests of the validity and the identifying power of a priori 
restrictions. The derivations are new and very much simplified from earlier 
versions, though perhaps one would hardly call them simple in an absolute 
sense. Though the article is not written in textbook form, it will form essen- 
tial supplementary reading to a book such as Klein’s, if it is desired to supply 
the student with a derivation of the basic estimation formulas.¹ 

The first paper, by Jacob Marschak, is an excellent exposition of the role 
of statistical inference in economic policy and prediction. The concepts 
studied by Koopmans and Hood are here introduced in relation to the uses 
for which they are intended. The second paper, by Koopmans, is a non- 
technical exposition of the concept and problems of identification; it is re- 
produced, with minor revisions, from a paper published in Econometrica. 
The careful discussion of various examples will be invaluable pedagogically. 
In the third paper, Herbert A. Simon seeks, on the basis of the concepts of 
exogenous and endogenous variables, to define the notion of causality in a 
way which will meet positivist objections, such as those of Hume. He argues 
that the complete rejection of the concept of causality (as opposed to func- 
tional interrelationship), as in Russell’s position (to which, however, Simon 
does not refer), does not correspond to the intuitive practice of scientists. 
An interesting discussion is then given of causal ordering of variables in a 
linear structure; the concept is closely related to that of identifiability. 

The fourth and fifth papers, by Trygve Haavelmo, and by M. A. Girshick 
and Haavelmo, respectively, are now famous empirical applications of the 
simultaneous equations method to the consumption function and to the 
demand for food. 

The seventh paper, by Herman Chernoff and Herman Rubin, shows, 
without proofs, how limited-information estimates may be used even when 
the conditions under which they were derived are not valid, in the sense that 
they give rise to consistent estimates. From the point of view of new knowl- 
edge, this is undoubtedly the most important paper of the volume. It is 
remarked that if predetermined variables in the system but not in the group 
of equations to be estimated are omitted, the estimates resulting will still be 
consistent if the variables omitted are not needed for identification. It is also 
shown that in many cases errors in the variables and non-linearities in the 
equations can be accommodated. 

In the eighth paper, Stephen G. Allen studies, in an example, the loss of 
efficiency by omitting a predetermined variable in a particular equation to 
be estimated. In the ninth paper, Jean Bronfenbrenner (Crockett) examines 
the bias attributable to the use of the method of least squares in a two- 
equation model. Both papers are very illuminating in giving a more intuitive 
appreciation of the sense in which the simultaneous equations method is 
optimal. 

The last paper, by Herman Chernoff and Nathan Divinsky, is an extremely 
complete exposition of the computational methods used in various types of 
maximum-likelihood estimates. The practicing econometrician will make ex- 
tensive use of this section. 

This is a very useful collection of papers, which I can strongly recommend. 


Stochastic Processes. J. L. Doob. New York: John Wiley and Sons, 1953. Pp. 
vii, 654. $10.00. 


P. A. P. Moran, Australian National University, Canberra 


The average statistician or worker in applied probability theory never has 
to deal with more than a finite number of random variables and thus 
never really needs to know much about measure theory. But the mathema- 
tician who wishes to found probability theory on a rigorous basis needs more 
elaborate theory. The strong law of large numbers (which is essentially an 
empirically unverifiable theorem) necessarily involves a theory of measure 
in a space with an enumerable infinity of dimensions while the theory of con- 
tinuous random processes requires, for a fully rigorous foundation, a deep 
discussion of measure theory. In the present very remarkable book Professor 
Doob sets out a fully rigorous foundation for the theory of random processes 
both with discrete and with continuous parameters and in the course of doing 
this discusses a number of theorems of interest to those working on other 
parts of pure probability theory. The result is the most complete discussion 
yet published in book form of the foundations of the theory of processes. 

The book opens with a chapter on probability theory which is openly and 
frankly (and in the opinion of the reviewer, rightly) equated with the theory 
of measure. The elementary ideas of random variables, families of variables, 
and modes of convergence for random variables are very carefully intro- 
duced, and then follows the most important part of this chapter, a discussion 
of conditional probabilities. This is usually skirted around in textbooks since 
a rigorous treatment requires considerable care, and was first given by 
Kolmogoroff. Then, after some standard results on characteristic functions 
there are some very interesting new inequalities between the tails of a dis- 
tribution and integrals involving its characteristic function. 

In the second chapter the author introduces the idea of stochastic process 
and considers the classification of such processes in general. For a process in 
which “time” (here always represented by a variable on a linear set) is con- 
tinuous we immediately get into difficulties in setting up a probability meas- 
ure in the class of all realizations of such a process. These are overcome by 
insisting that the processes be “separable” in a certain sense introduced by 
the author. This is probably the most difficult part of the book to the average 
reader and requires a knowledge of measure theory beyond that of most 
analysts. A useful supplement at the end of the book attempts to bridge the 
gap. Next, the author introduces Gaussian processes and gives a discussion 
of the Markovian property which is rather more careful than is usual. 

The next chapter on processes with mutually independent variables is 
concerned with classical results in the theory of probability, most of which 
will be more or less familiar to the reader. These are not discussed for their 
own sake but for the light they throw on random processes. The Borel- 
Cantelli lemma and similar results on series of variates and the law of large 
numbers are discussed with great care. Next we have a welcome account of 
infinitely divisible distributions and Lévy’s general formula for their char- 
acteristic functions. The arithmetic of distributions is not, however, con- 
sidered further than the needs of the theory of processes require. A general 
account of this subject in English is much to be desired but remains to be 
written. 

Chapter IV considers processes with mutually uncorrelated but not 
necessarily independent variables. This is thus more general and the interest 
lies in seeing how far we can get with the weaker assumption. 

Chapter V on Markov chains with a discrete parameter deals with rela- 
tively familiar theory. The usual theory of a finite Markov chain with sta- 
tionary transition probabilities, and the classification of the states, is given 
together with a short application to card mixing. Multiple Markov chains 
in which the transition probabilities depend not merely on the previous 
states but also on earlier states are reduced in the obvious way to ordinary 
Markov chains. This, however, is an illustration of the way in which the 
author deals only with the general theory of the subject (difficult as that is) 
without dealing with the analytical difficulties which arise as soon as we 
specialize the theory to some particular problem. Complex Markov chains 
(as Markov originally called them) are very awkward to deal with in prac- 
tice. Next we have a long and rather difficult account of the generalization 
(mainly due to Doeblin) of the previous theory to general state spaces and 
the corresponding law of large numbers and a central limit theorem. 
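
The reduction of a multiple chain to an ordinary one can be sketched in a few lines. The example is an illustration only and is not taken from the book: it assumes a second-order chain on a finite set of states, supplied as a hypothetical table `p2[(i, j)][k]` giving the probability of moving to state k when the two most recent states were i and then j. The function name `lift_to_pairs` and the numerical values are likewise invented here. The ordinary chain lives on the pairs (i, j) and moves from (i, j) to (j, k) with that same probability.

```python
from itertools import product


def lift_to_pairs(states, p2):
    """Build the one-step transition table of the chain on pairs of states.

    p2[(i, j)][k] is the assumed probability of next state k given that the
    two most recent states were i (earlier) and j (latest). The returned
    table q satisfies q[(i, j)][(j, k)] = p2[(i, j)][k]; every transition
    to a pair that does not begin with j has probability zero.
    """
    pairs = list(product(states, repeat=2))
    q = {a: {b: 0.0 for b in pairs} for a in pairs}
    for (i, j) in pairs:
        for k in states:
            q[(i, j)][(j, k)] = p2[(i, j)][k]
    return q


# Hypothetical two-state example: repeating the latest state is more
# likely when the two most recent states already agree.
STATES = ("a", "b")
P2 = {
    ("a", "a"): {"a": 0.9, "b": 0.1},
    ("b", "a"): {"a": 0.6, "b": 0.4},
    ("a", "b"): {"a": 0.4, "b": 0.6},
    ("b", "b"): {"a": 0.1, "b": 0.9},
}
Q = lift_to_pairs(STATES, P2)
```

The price of the reduction is visible at once: with m states and dependence on r earlier steps the lifted chain has m^r states, which is one reason why such chains are awkward to handle in particular problems.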

Next we have a chapter on Markov processes with a continuous parameter, 
first those with a finite number of states, with the usual theory of forward 
and backward differential equations, and then those with a continuous state 
space and chains with an enumerably infinite set of states, the latter being 
nowadays of very great importance in such subjects as the theory of queueing. 
The Fokker-Planck and related equations are then considered at some length. 

Chapter VII is a long (100 pages) and interesting chapter on what are now 
known as martingales. This word, which is due in this connection to J. Ville, 
is usually used in connection with the harness of horses and the rigging of 
ships but is also used, for some odd reason, for a gambling system in which 
the stake is increased by a factor of two at each trial. In probability theory 
a martingale is defined as a random process $\{x_t\}$ such that $E\{|x_t|\} < \infty$ and 

$$x_{t_n} = E\{x_{t_{n+1}} \mid x_{t_1}, \ldots, x_{t_n}\}$$

with unit probability, whenever $t_1 < \cdots < t_{n+1}$, and $n$ is an arbitrary positive 
integer. This watery looking object does not look at first sight as if it would 
have a very interesting theory, but the theorems and results which follow are 
most interesting and varied and introduce a new unity into a widely scat- 
tered series of problems. Much of the work described is due to the author. 

He first considers applications to games of chance where the idea of 
martingale is used to provide a definition of a “fair” game. This leads to a 
new and interesting discussion of the effect on fair games of systems of op- 
tional stopping and sampling. Next follow some new inequalities for expec- 
tations in martingales and a series of convergence theorems. The general 
theory is then applied to sums of independent variables, to the strong law of 
large numbers, and to the theory of derivatives, and the relationship to the likeli- 
hood ratio and sequential analysis is pointed out. The corresponding theory of 
martingales with a continuous parameter is then developed at some length 
together with some remarks on the application to Poisson and Brownian 
processes. 
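
The connection between fair games and optional stopping can be illustrated with the gambling system from which the word martingale is borrowed. The sketch below is not from the book. It assumes a fair coin, an initial stake of one unit, a stake doubled after every loss, and play that stops at the first win or after a fixed number of rounds; the function names are invented for the illustration. Exact rational arithmetic is used so that the expectation can be checked without rounding.

```python
from fractions import Fraction


def doubling_outcomes(max_rounds):
    """Return (net gain, probability) pairs for the doubling system.

    Assumptions of this sketch: a fair coin, an initial stake of one unit,
    the stake doubled after each loss, and play stopped at the first win
    or after max_rounds consecutive losses.
    """
    outcomes = []
    lost_so_far = 0
    stake = 1
    for n in range(1, max_rounds + 1):
        # First win occurs on round n with probability (1/2)**n;
        # the net gain is the stake won minus the earlier losses.
        outcomes.append((stake - lost_so_far, Fraction(1, 2 ** n)))
        lost_so_far += stake
        stake *= 2
    # Every round lost: probability (1/2)**max_rounds.
    outcomes.append((-lost_so_far, Fraction(1, 2 ** max_rounds)))
    return outcomes


def expected_gain(max_rounds):
    """Exact expected net gain of the stopped game."""
    return sum(gain * prob for gain, prob in doubling_outcomes(max_rounds))
```

Under these assumptions the player wins one unit with probability 1 - 2^(-n) and loses 2^n - 1 units with probability 2^(-n), so `expected_gain(n)` is exactly zero for every bound n: the stopping rule does not turn the fair game into a favorable one.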

In the eighth chapter processes with independent increments, which form 
a logical prelude to the study of stationary processes, are considered. 
Examples of these are provided by the elementary Poisson process and the 
Brownian movement. The centering of the general process of this type and 
the classical theory of the character of the distribution function (due to its 
infinite divisibility) are discussed. Cramér’s classical theorem on the sum of 
two independent random variables (for which an ‘elementary’ proof is 
much to be desired) is mentioned but not proved or used in this treatment. 

In the following chapter the previous theory is generalized to processes 
whose increments are only orthogonal instead of independent, and this is 
combined with a detailed discussion of stochastic integrals. An interesting 
application mentioned is that to Campbell’s theorem. However, the applica- 
tion of this theorem to electrical noise is not developed and Campbell’s name 
is not to be found in the bibliography. An interpretation of the idea of a 
Fourier transform of an actual realization of a process is followed by a gen- 
eralization and further discussion of stochastic integrals. 

Next, in chapter X, we come down to stationary processes with a discrete 
parameter which are prefaced by a detailed discussion of measure preserving 
transforms. The strong law of large numbers and the Wold-Khintchine 
theorem are proved and illustrated and this is followed by a discussion of 
the effect of linear operators on the spectrum of a process and of processes 
with rational spectra. In the following chapters all these ideas are discussed 
for continuous processes. 

The final chapter is a rigorous discussion of linear least squares prediction 
for stationary processes, a subject which is of great interest in practice. The 
practical applications are, however, not discussed. It is natural to confine 
the discussion to linear least squares prediction but it is worth pointing out 
that there is a need for more research on cases where non-linear prediction is 
better. Consider, for example, a discrete process $\{x_n\}$ where each $x_n$ is dis- 
tributed uniformly on $(-1, 1)$ and $x_{n+1} = 1 - 2|x_n|$. Then all the serial cor- 
relations are zero but an exact predictor always exists. There is also a need 
for research into processes which are generated by nonlinear difference equa- 
tions and processes which are not symmetric about their means. 
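
The example of a process with vanishing serial correlations and an exact non-linear predictor can be checked numerically. The following is a rough sketch and not part of the book under review. It assumes the recursion takes each value x to 1 - 2|x|, and it estimates the correlations from many independent starting values rather than from one long realization, since repeated doubling in binary floating point exhausts the digits of a single starting value after a few dozen steps. All function names are invented for the illustration.

```python
import random


def tent(x):
    """One step of the recursion x -> 1 - 2|x| on the interval (-1, 1)."""
    return 1.0 - 2.0 * abs(x)


def correlation(a, b):
    """Ordinary product-moment correlation of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5


def lagged_correlations(n_paths=100_000, max_lag=3, seed=0):
    """Estimate corr(x_0, x_k) for k = 1..max_lag over independent starts."""
    rng = random.Random(seed)
    start = [rng.uniform(-1.0, 1.0) for _ in range(n_paths)]
    current = start
    result = []
    for _ in range(max_lag):
        current = [tent(x) for x in current]
        result.append(correlation(start, current))
    return result
```

Each later value is an even function of the starting value, which is symmetric about zero, so the estimated correlations should differ from zero only by sampling error, while `tent` itself predicts the next value without error. The best linear predictor, by contrast, is simply the mean.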

The author is much to be congratulated on this very important book. 
His aim of giving a rigorous foundation to the subject is, no doubt, mainly 
responsible for there being no space for a discussion of the problem of in- 
ferring the structure of a process from a sample realization, and very little 
discussion of particular processes. As a result much work on processes is 
never referred to, and the names of Yule, Bartlett, Pitt, Quenouille and the 
Kendalls never appear, a fact for which the author rightly apologizes in his 
preface. The printing, binding, and textual accuracy are of the very highest 
quality. 


Introduction to the Theory of Stochastic Processes Depending on a Continuous 
Parameter. Henry B. Mann. National Bureau of Standards, Applied Mathe- 
matics Series, 24. Washington: U. S. Government Printing Office, 1953. Pp. v, 
45. $0.30. 


ULF GRENANDER, University of Stockholm 


This booklet should be useful for a reader who wishes to find out quickly 
and not in too great detail what sort of questions are dealt with in the 
theory of stochastic processes. Its 45 pages contain a good deal of information 
on this subject. 

After a discussion of some basic concepts, the author defines a stochastic 
process as a one-parameter family of stochastic variables and studies various 
linear operations such as differentiation and integration. Some special proc- 
esses are discussed in Chapters 2 and 4, mainly of independent increments, 
and related statistical problems are dealt with in Chapter 3. In Chapter 5 
counter data are considered as forming a stochastic process and Chapter 6 
finally is devoted to harmonic analysis of processes and the mean ergodic 
theorem. 

The clarity of the exposition and the simplicity of the mathematical 
machinery that is used make the book easy to read. The reader who wants 
to pursue the topic further can find a more complete treatment in two 
recently published books, J. L. Doob’s Stochastic Processes (Wiley 1953), and 
A. Blanc-Lapierre and R. Fortet’s Théorie des fonctions aléatoires (Masson 
et Cie, 1953). 


Small Particle Statistics. Gustav Herdan. Amsterdam: Elsevier Publishing Com- 
pany, 1953. Pp. xxiii, 520. $12.00. 


BENJAMIN EPSTEIN, Wayne University 


In modern technology, particles ranging in size roughly from 10 down to 
10-* centimeters play a very important role. For example, the strength 
of glassware, ceramic ware, or cement depends to large degree on the fineness 
and size distribution of the raw material being used. The resistance of dyes 
and paints to weathering and many other physical properties are strongly 
affected by the size distribution of the raw materials used and by the way in 
which they are dispersed throughout the dye or paint. The health of workers 
in a factory is affected by the kind, density, and distribution of pollutants 
(generally fine particles) in the atmosphere. In nature, microscopic soil prop- 
erties are of great importance in sedimentary petrography, in agriculture, 
and in soil physics. The suitability of coke as a blast furnace fuel can be pre- 
dicted to some extent from the kinds of size distributions obtained when a 
sample of coke is broken up into small pieces by the application of various 
breakage processes. It is virtually impossible to study such properties as 
“grindability,” “resistance to impact,” “resistance to abrasion,” and the like, 
without dealing with particle size distributions and how they change, for 
example, with time or energy expended. 

The author gives an exhaustive account of the current state of knowledge 
in this field. Since the particles being measured are generally quite small this 
raises many problems. For example, the problems of how to prepare a sample 
for measurement, how to carry out the measurements, and what to measure 
are quite involved technically and important statistically because of the 
way in which they can affect the data with which the statistician will be 
asked to work. The author treats this aspect of the subject admirably, going 
into such things as sieving, sedimentation, microscopic, and adsorption meth- 
ods, etc. He also considers various ways of recording data whether by size, by 
surface area, by weight, etc., and indicates the physical reasons why one 
measure might be preferable to another depending on circumstances. The 
reviewer can say, from his own experience, that the statistician called upon 
to give advice in this field would do well to be aware of such technical 
considerations. 

Roughly half of the book is devoted to a treatment of elementary statistics 
and the elements of experimental design. This was done by the author in 
order to make the book self-contained. In the opinion of the reviewer, it 
would have been better to eliminate most of this material and advise the 
reader to become acquainted with a basic applied statistics book such as that 
recently written by A. Hald. Specifically, a good deal of the material in 
Chapters 2, 4, 6, 7, 8, 10 could have been omitted or treated with greater 
brevity. By and large the author’s treatment of statistics is sound. The 
author does, however, seem to slip up in the examples on p. 160 and p. 162, 
since one should surely separate out the effect of variation among the 
laboratories in running the tests of significance. Analysis of variance is called 
for and not a simple t-test. 

The statistician will find parts of this book very interesting. Among the 
specially noteworthy features are: (1) critical discussion of how to draw a 
sample, what and how to measure, kinds of errors likely to appear, etc.; (2) 
discussion of the distribution laws arising in particle statistics (Chapter 6); 
(3) discussion of the mechanism of crushing and grinding and associated 
statistical questions (Chapter 13); (4) statistics applied to problems of mixing 
(Chapter 14); (5) consideration of the statistics of polymerized materials 
(Chapter 15); (6) sampling procedure in sedimentary petrology (pp. 
417-429). Other good features of the book are the many illustrations, ex- 
cellent figures, and extensive bibliography. 

Dr. Herdan wrote Chapters 1-17. Chapters 18-23 were written by Dr. 
M. L. Smith. These latter chapters are devoted to a critical discussion of 
various experimental methods for determining both size distribution and 
surface area. This is a complicated problem in the subsieve range and re- 
quires very delicate experimental procedures. 

To sum up, the book is a “must” for all who work in the statistics of fine 
particles. It should give the statistician a good deal of insight into the prob- 
lems peculiar to this field. It should give the technologists and scientists an 
appreciation of what statistical methods can accomplish in this area. The 
book should certainly stimulate healthy cooperation. 


U. S. Army, Ordnance Corps. Tables of the Cumulative Binomial Probabilities. 
Ordnance Corps Pamphlet ORDP 20-1, September 1952. Pp. viii+577. 9×12 
inches. For sale by Office of Technical Services, Department of Commerce, 
Washington 25, D. C. at $6.00 per copy. Orders should cite Order No. PB 111389. 


This is by far the most extensive table of the binomial distribution published 
yet. Cumulative probabilities are available to seven decimals for popu- 
lation proportions from 0 to 1 by steps of 0.01, for sample sizes through 150 
by steps of 1. An Introduction gives a good explanation of the use of the ta- 
bles, and explains their relations to the Incomplete Beta Function Ratio. 
These tables will be indispensable to all practicing statisticians concerned 
with the binomial distribution. 
W.A.W. 
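
For readers without access to the tables, individual entries of this kind can be recomputed directly. The sketch below is only an illustration and makes no claim about how the tables themselves were computed; the notice does not say which tail is tabulated, so both are given, and the function names are invented here. The upper tail is the quantity that equals the Incomplete Beta Function Ratio I_p(c, n - c + 1) for c at least 1.

```python
from math import comb


def binomial_cdf(c, n, p):
    """P(X <= c) for X binomial with n trials and success probability p."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(0, c + 1))


def binomial_upper_tail(c, n, p):
    """P(X >= c), computed as a direct sum over the upper tail."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(c, n + 1))
```

For example, `binomial_upper_tail(5, 10, 0.3)` gives the probability of five or more successes in ten trials when the population proportion is 0.3. Direct summation in floating point is adequate for sample sizes of the order covered by the tables, though it will not in general reproduce seven decimals without some care about rounding.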


The Theory of Inventory Management. Thomson M. Whitin. Princeton: Prince- 
ton University Press, 1953. Pp. viii, 245. $4.50. 


ROBERT DORFMAN, University of California, Berkeley 


Inventories and their management go back to Joseph, advisor to Pharaoh, 
at least. The theory of inventory management has a much shorter history, 
however. Up to the 1920’s, inventory policy seems to have been based largely 
on rule-of-thumb. Some of the simpler problems were formulated and solved 
during the 1920’s, but the development of a systematic theory as a branch 
of economic and managerial science was undertaken only after World War 
II, as an aspect of the operations research movement. 

Whitin’s monograph is an introduction to this promising and fast-growing 
field. It is divided into three parts. In Part I, Whitin develops the principles 
of inventory management in the individual firm and compares his conclusions 
with those of earlier writers, particularly Eiteman and Boulding. Part II 
deals with the effects of inventory policies on fluctuations in economic activ- 
ity and the implications of inventory theory for theories of general economic 
equilibrium. In this connection the theories of Keynes, Metzler, and Leon- 
tief are examined in the light of the results of Part I and of empirical data. 
Part III treats the inventory problems of the national military establish- 
ment and applies some of the principles of Part I to them. 

The characteristics of an optimal inventory policy for an individual firm, 
dealt with in Part I, are the foundation of the entire treatment. There are, 
essentially, two issues to be considered in formulating an inventory policy. 
The first of these concerns cost minimization: the costs of carrying large 
inventories have to be balanced against the costs of reordering supplies at 
frequent intervals. If the cost functions are simple enough, the size of order 
which minimizes the sum of the ordering cost and the carrying cost per unit 
(this is the economic purchase quantity) can be determined by the differ- 
ential calculus. Whitin shows that this leads immediately to two interesting 
consequences: first, the economic purchase quantity and therefore the 
average size of inventories varies in proportion to the square root of the vol- 
ume of sales; second, the superficial dictum that “the higher the turnover 
rate the better” is misleading because it may lead to excessively high reorder- 
ing costs. 
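
Both consequences can be seen in a small numerical sketch. The cost functions are not stated in this review, so the example assumes the simplest textbook form: a fixed cost per order, a carrying cost per unit held per period, and a steady rate of sales, giving a total cost of (order cost × sales / Q) + (carrying cost × Q / 2) for order size Q. The function names and the figures are hypothetical.

```python
from math import sqrt


def total_cost(order_size, sales, order_cost, carrying_cost):
    """Ordering cost plus carrying cost per period under the assumed form."""
    ordering = order_cost * sales / order_size      # number of orders times cost per order
    carrying = carrying_cost * order_size / 2.0     # average stock is half the order size
    return ordering + carrying


def economic_purchase_quantity(sales, order_cost, carrying_cost):
    """Order size at which the derivative of total_cost with respect to size vanishes."""
    return sqrt(2.0 * order_cost * sales / carrying_cost)


# Hypothetical figures: 1,000 units sold per period, 10 per order, 2 per unit carried.
q_base = economic_purchase_quantity(1000, 10, 2)
q_quadrupled_sales = economic_purchase_quantity(4000, 10, 2)
```

With these figures quadrupling sales only doubles the order size, which is the square-root relation. Halving the order size from `q_base` doubles the turnover rate but raises `total_cost`, which illustrates why the dictum about turnover is misleading.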

The second main issue concerns risk minimization: the risks of overstock- 
ing associated with large inventories have to be balanced against the risks of 
depletion associated with small inventories and against the disorganization 
of production and sacrifice of sales and goodwill which depletion entails. 
This is a far more complicated problem than cost minimization. The major 
aspects of this problem are: (1) the measurement of the losses which would 
result from overstocking if it occurs and from depletion if that occurs, i.e., 
the determination of the “loss function” in Abraham Wald’s terminology; 
(2) the estimation of the probability distribution of withdrawals from inven- 
tory, which determines the probability of the various possible losses associ- 
ated with a given inventory policy; (3) the establishment of a policy cri- 
terion, be it minimum-maximum loss, minimum expected loss, or whatever; 
(4) the consolidation of these three aspects into a decision procedure which 
determines optimal inventory policy as a function of observable data. 
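
The four aspects listed above can be put together in a toy decision procedure. The sketch is an illustration under stated assumptions, not a method taken from the monograph: the loss is assumed linear in the surplus or the deficiency, the distribution of withdrawals is assumed known and discrete, and the names `loss`, `best_stock`, and `WITHDRAWALS` are invented for the purpose.

```python
def loss(stock, withdrawals, over_cost, short_cost):
    """Assumed loss function: linear in the surplus or in the deficiency."""
    if stock >= withdrawals:
        return over_cost * (stock - withdrawals)
    return short_cost * (withdrawals - stock)


def best_stock(candidates, distribution, over_cost, short_cost, criterion):
    """Choose the stock level that is best under the stated policy criterion.

    distribution maps each possible withdrawal to its probability.
    criterion is "expected" (minimum expected loss) or "minimax"
    (minimum of the maximum loss over withdrawals of positive probability).
    """
    def score(stock):
        cases = [(loss(stock, w, over_cost, short_cost), p)
                 for w, p in distribution.items() if p > 0]
        if criterion == "expected":
            return sum(value * p for value, p in cases)
        if criterion == "minimax":
            return max(value for value, _ in cases)
        raise ValueError("unknown criterion: " + criterion)

    return min(candidates, key=score)


# Hypothetical data: small withdrawals are likely, large ones rare;
# running short is three times as costly per unit as overstocking.
WITHDRAWALS = {0: 0.50, 1: 0.30, 2: 0.10, 3: 0.05, 4: 0.05}
by_expected_loss = best_stock(range(5), WITHDRAWALS, 1, 3, "expected")
by_minimax = best_stock(range(5), WITHDRAWALS, 1, 3, "minimax")
```

With these hypothetical figures the two criteria disagree: minimum expected loss holds one unit, whereas the minimax rule holds three. The choice of criterion is therefore a substantive decision and not a formality.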

Whitin’s handling of the risk problem is different in content and spirit 
from the quadri-partite analysis sketched above. There is little or no dis- 
cussion of the loss function or of the problems of ascertaining it, and the 
worked-out illustrations depend on the assumption that the losses resulting 
from depletion are a known dollar-and-cents unit cost multiplied by the 
amount of the inventory deficiency. It is assumed without argument that 
the objective of inventory policy is to minimize the expected loss, so that 
the issue of selecting a policy criterion does not arise. Nor does Whitin bring 
up the problem of an integrated decision procedure. Instead he handles 
separately the two sub-problems of (a) determining the probability distribu- 
tion of withdrawals from inventory and (b) determining the cost-minimizing 
policy assuming that the probability distribution is given. 

Indeed, Whitin simplifies the problem even further because most of his 
discussion concerns the case, relatively infrequent in practice, in which the 
probability distribution is known in advance. In his most extended treatment 
of a case with an unknown probability distribution, the case of style-goods, 
he recommends simply that the distribution be estimated by asking buyers 
to forecast the maximum amount of sales they foresee and adjusting these 
forecasts in the light of the past performance of the forecasters. (See pp. 
70-71.) 

What is left after all these simplifications is the problem of minimizing 
inventory costs and losses, given the probability distribution of withdrawals 
from stock. The problem is further limited in much of the development by 
assuming that the acceptable level of risk of running out of stock has been 
predetermined. The cost of all these restrictions is suggested by the work of 
Dvoretzky, Kiefer, and Wolfowitz, who have shown that if reordering costs 
are appreciable, a policy of a type excluded by Whitin may be optimal.¹ 

Whitin’s simplifications have the virtue of rendering the inventory prob- 
lem amenable to the methods of the differential calculus, or the marginal 
analysis beloved of economists. Even though Whitin’s methods would be 
inadequate in many practical situations, in circumstances in which uncer- 
tainty is unimportant and in which the costs of overstocking and under- 
stocking can be calculated without difficulty they should provide useful 
guidance. Moreover, some of the consequences of the analysis, especially the 
tendency of optimal safety margins to increase according to the square root 
of the level of sales, are generally valid and highly suggestive. 

As applied to the study of economic fluctuations and general equilibrium 
in Part II, the main consequence of the theoretical analysis is that inven- 
tories tend to increase in proportion to the square root of the level of eco- 
nomic activity. Metzler, Boulding, and Leontief² have all assumed, at one 
time or another, that inventories vary in direct proportion to the level of 
activity. Their theories are, therefore, subject to criticism. Whitin also pro- 
duces some empirical data which tend to support his position. He concludes, 
rather tentatively, that although the square-root law mitigates the de- 
stabilizing effect of inventories on business cycles, inventories probably do 
contribute to cyclic instability. 

Part III discusses the inventory problems of the national military estab- 
lishment. The wastefulness of some current rule-of-thumb practices is 
pointed out in convincing detail. In this context Whitin raises one of the fun- 
damental questions which he neglected in his general treatment: the estima- 
tion of the cost of running out of stock of some item or, in other words, the 
determination of the marginal value of an inventoried item. His proposal 
is to apply the methods of game theory, that is, to calculate the value of a 
war game with a given stock of the item and compare this with the value 
computed with the stock increased by one unit. He nowhere mentions the 
use of the closely related methods of linear programming which, also, yield 
estimates of the value of inventories in military and private organizations. 

This is the first full-length treatment of a new field. The exposition is 
generally clear and the mathematics employed are simple and familiar. Many 
of the results included have already been applied successfully and the treat- 
ment unmasks a number of common fallacies about inventory management 
and behavior. This book can therefore serve usefully as an introduction to 
the field. But the reader should be warned that this monograph contains 
only a smattering of what is known today about the theory of inventory 
management and that what is known today is only a smattering of what we 
shall need to know before the theory is ready for wide application. 

1 A. Dvoretzky, J. Kiefer, and J. Wolfowitz, “The Inventory Problem,” Econometrica, 20 (1952), 
187-222, 450-66. 

2 Lloyd A. Metzler, “The Nature and Stability of Inventory Cycles,” Review of Economic Statistics, 
August 1941; Kenneth E. Boulding, A Reconstruction of Economics (New York, 1950), Part I; W. W. 
Leontief and others, Studies in the Structure of the American Economy (New York, 1953), Chapter 3. 


The Role of Mergers in the Growth of Large Firms. J. Fred Weston. Berkeley: 
University of California Press, 1953. Pp. xvi, 159. $3.50. 


See review article by G. Warren Nutter on page 448. 


Studies in Income and Wealth, Volume Fifteen. Conference on Research in In- 
come and Wealth. New York: National Bureau of Economic Research, 1953. 
Pp. x, 230. $3.50. 


H. S. HOUTHAKKER, Stanford University 


This volume of the well-known series contains eight papers presented in 
1950 at a conference in Allerton Park, Illinois. Although the meeting was 
intended to deal with the distribution of income by size, only one of the 
papers is strictly concerned with that subject, most of the others being de- 
voted to problems arising in the cross-section analysis of the use of personal 
income. It must be hoped that a future conference will go into the original, 
relatively unexplored area, but the quality of some of the papers collected 
here compensates for the change in emphasis. 

The only paper on income size distributions, by George Garvy, is mainly 
expository. Of greater interest is a contribution by D. Gale Johnson, who 
tries to elucidate the low incomes of southern farm families by comparing 
the income of non-farm families in the south and elsewhere. Though ham- 
pered by apparent inconsistencies in the data, he advances some remarkable 
conclusions which are supplemented by comments from other students of 
regional incomes, who present new data. 

In a paper written in 1935, but not hitherto published, Milton Friedman 
suggests a new method of ranking families of different composition by their 
relative economic status. Following Sydenstricker and King’s pioneering 
article in the 1920-21 volume of this Journal, he proposes the estimation of 
weights for each class of family members such that, when both income and 
a particular item of expenditure are divided by the sum of the relevant 
weights, the resulting relation is independent of family composition. As is 
recognized by the author and confirmed by calculations in Jean Mann Due’s 
comment, this method leads to inconsistent results because the weights de- 
pend on the expenditure item considered. It is therefore surprising that 
Friedman rejects, on “pragmatic” grounds, a more acceptable definition, 
which distinguishes between income weights and specific weights. The latter 
approach occurs in R. G. D. Allen’s contribution to the Schultz memorial 
volume (Studies in Mathematical Economics and Econometrics, Chicago, 
1942) and has been used with some success by authors associated with the 
Cambridge Department of Applied Economics (particularly by S. J. Prais 
in the Economic Journal of December 1953). 

It is perhaps even more surprising that Professor Friedman should have 
been so ready to infer the “economic status” of different families from their 
incomes and expenditures only. Since these data by themselves show only 
shifts in demand functions, not effects on satisfaction, any such endeavor 
necessarily involves additional, usually unstated, assumptions, which reflect 
nothing but the preconceptions of the investigator. The objections to inter- 
personal comparisons of utility apply with full force here, especially because 
the presence or absence of children may itself have a strong influence on 
well-being and this influence cannot normally be isolated. 

Much the same point has to be made in connection with Mollie Orshan- 
sky’s attempt to find “equivalent” levels of living for farm and city dwellers. 
Except by asking highly sophisticated questions there is no way of deter- 
mining the income at which city families are as well off as farm families with 
a given income. The particular method followed in this paper, first suggested 
by Dorothy S. Brady, is based on the alleged existence of income levels at 
which the income elasticity of quantity (as distinct from expenditure) 
reaches a maximum. Quite apart from the statistical difficulties in estimating 
them, the mere fact that these income levels, if they exist at all, are not the 
same for all commodities proves that they have no welfare significance 
whatever. Nevertheless Miss Orshansky’s calculations bring out some in- 
teresting features of household expenditure patterns. 

Data collected by the Survey Research Center of the University of Michi- 
gan are analyzed by Janet A. Fisher, who is concerned with the incomes, 
savings, and assets of families during their economic life cycle. Most of the 
figures are classified by the age of the household’s head, and the remarkable 
regularities that appear agree fairly well with a priori expectations. Further 
research should reveal whether the use of other indicators of the economic 
age of the family, such as the duration of the marriage and the number of 
children, will yield still more informative results, but Miss Fisher’s data did 
not allow such analyses. The importance of this field of investigation is 
underlined by recent developments in the study of the consumption function, 
which emphasise its dynamic aspects. 

Dynamic factors, though of a different nature, are also considered in an 
ingenious but obscure contribution by Dorothy S. Brady. By comparing 
budget surveys from various periods she tries to identify a “normal form” 
of the consumption function, in which not only the level of but also the 
change in income is regarded as a determining variable. Not having any data 
on income changes at her disposal, she apparently allows for the latter effect 
by using community income as well as family income. The precise logic of 
her method is unfortunately not made clear. It might be suspected, for in- 
stance, that in comparing the 1917-19 with the 1935-36 survey changes in the 
price level should be taken into account, but nothing is said on that topic. 
Nor is the reader’s bewilderment allayed by cryptic footnotes such as “Ex- 
penditures and savings were standardized to an average family size of 3.5 
persons.” It is quite possible that Mrs. Brady’s approach leads to impor- 
tant results, but the present exposition is mystifying rather than convincing. 

Margaret G. Reid’s valuable paper illustrates J. R. Hicks’ dictum to the 
effect that the income one can calculate is not the income one seeks, whereas 
this true income escapes calculation. She starts from the observation that 
the income elasticity of consumption is less for farm families than for others, 
and analyzes to what extent this may be due to the income concept used 
rather than to differences in behavior patterns. The choice of concept is 
particularly important in the case of farm families because of the difficulty 
of separating operating and living expenses and the related problems of de- 
preciation and inventory changes. Similar questions arise for all families 
with highly variable incomes. Miss Reid surveys some alternative classifica- 
tions of families with reference to the resulting biases of this nature, but she 
defers a definite recommendation. 

In a concluding paper Simon Kuznets outlines directions of further inquiry. 
His stimulating suggestions, which demonstrate a penetrating insight into 
both what is most necessary and what is feasible in empirical research, 
are too numerous to be discussed here. He might have put a little more 
emphasis on theoretical research, however; even if this does not lead to 
immediately applicable conclusions it may yet help to clarify the issues and 
methods involved. Several of the contributions to this volume would have been 
improved if the models used had been more explicitly formulated. As it stands, 
the volume shows impressively what a wealth of basic information is already 
in existence; the main problem for the near future is how to exploit it more 
effectively. Despite their various, and mostly excusable, defects, these papers 
constitute a significant advance in the solution of that problem. 


Better Population Forecasting for Areas and Communities. Van Beuren Stanbery. 
(Domestic Commerce Series No. 32). U. S. Department of Commerce. Washington, 
D. C.: U. S. Government Printing Office, 1952. Pp. iv, 80. 25 cents, 
Paper. 


FREDERICK F. STEPHAN, Princeton University 


THE problem of making forecasts of population growth for cities, counties, 
and districts frequently arises in the work of local industries, 
public utilities, zoning and planning agencies, and municipal organizations 
concerned with the provision of schools, water, sanitation and other services. 
Frequently they find it necessary to plan far ahead because their facilities 
must be built in single units that can not be enlarged without excessive ex- 
pense. Forecasts of local population growth are affected greatly, of course, 
by the factors that influence the location of new industry and the expansion 
of business and other factors that affect employment in, and migration of 
population to or from the area. In addition to these factors, there are, of 
course, the natural growth of the resident population and the progression 
of each cohort through the ages at which it enters the labor force, establishes 
new households, and otherwise affects the total demand for the services for 
which the population estimates are being prepared. 

If the lot of the forecaster of population growth for the entire nation is a 
difficult and unhappy one—and recent experience would seem to make this 
undeniable—then the life of a forecaster of local populations must certainly 
be unbearable. For such a hazardous occupation it would seem that any 
accumulation or accretion of knowledge and any tricks or devices, no matter 
how crude, would be helpful. While it is too much to hope that a highly 
scientific methodology could be put together for forecasting populations in 
all kinds of communities, there are some communities in which special factors 
do permit highly accurate forecasting if full advantage is taken of them. 
Nonetheless, many statisticians look with misgiving at the relatively crude 
methods of estimation that must be employed, perforce, by anyone so brash 
as to attempt to make local population forecasts. 

The booklet that is under review is a contribution to “those who use or 
make population projections.” It reviews some of the more familiar methods 
of forecasting, such as the projection of past population growth, projection 
by use of the relation to national or regional areas, for which forecasts 
are already available, projection of migration and natural increase, and 
projection from specific estimates of future employment. Work sheets are 
sketched for these computations. Much advice is given about things to 
consider. It is largely a manual of suggestions and cautions written in un- 
sophisticated language. 
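
One of the methods listed above, projection by relation to a larger area for 
which a forecast is already available, can be sketched roughly as follows. 
This is only an illustration under a simple constant-share assumption; the 
function name and all figures are hypothetical and are not taken from the 
booklet, which may treat the relation differently. 

```python
def project_by_share(local_pop, regional_pop, regional_forecast):
    """Project a local population from a regional forecast.

    Assumption (hypothetical, for illustration only): the community keeps
    the same share of the regional population that it has today.
    """
    if regional_pop <= 0:
        raise ValueError("regional_pop must be positive")
    # Multiply before dividing so the arithmetic stays exact for whole numbers.
    return local_pop * regional_forecast / regional_pop


# Hypothetical figures: a town of 50,000 in a region of 1,000,000
# whose forecast population is 1,200,000.
projected = project_by_share(50_000, 1_000_000, 1_200_000)
print(projected)
```

A forecaster would ordinarily vary the share rather than hold it fixed; the 
constant share is used here only to keep the sketch short. 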

This guide book will be welcomed by many who want to see “better popu- 
lation forecasting.” Perhaps it is not without its liabilities, however, for it may 
encourage mediocre work at the expense of really good analysis. The bibli- 
ography lists a number of studies that would have increased the value of 
the text if they had been used adequately as examples. Moreover, there is 
little evidence that the author has looked into the forecasting that is done 
by the telephone and utility engineers or some of the regional business re- 
search bureaus. The methods of projection could well be improved by use of 
data correlated with population growth, such as meter installations, to fill 
in the gaps between censuses and to permit more effective linkage to economic 
projections for the area. If this is done then more efficient statistical methods 
can be used. All this suggests that local estimates should be prepared with 
the advice and assistance of those who can contribute all the data and tech- 
niques and judgment that are available in the community and pertinent. 
Such a group would be likely to emphasize the width of the band of uncer- 
tainty that cloaks the forecasts, an aspect that this booklet very properly 
does too. 


Population Changes in Europe Since 1939. Gregory (Grzegorz) Frumkin. New 
York: Augustus M. Kelley, Inc., 1951. Pp. 191. 


DOUGLAS F. DOWD, Cornell University 


THIS study would appear, at first glance, to be one of modest compass, 
dealing with assuredly useful but nevertheless dreary data. In Mr. 
Frumkin’s hands the data come alive, his task is revealed as a staggering one, 
and one leaves the book shocked anew by at least one meaning of Hitler’s 
War: the death and displacement of scores of millions of people. 

Mr. Frumkin—now with the United Nations, and editor of the Statistical 
Year Book of the League of Nations throughout its existence—attempts “a 
systematic determination of the magnitude of the changes which have oc- 
curred in Europe’s population since 1939 and of their determining factors 
... country by country, on a uniform pattern, by means of balance 
sheets...” (p. 9). 

In the brief first chapter, the population background of Europe (before 
1939) is discussed, separate charts for twenty countries showing birth and 
death rates and the natural increase of population are presented, and we are 
furnished with a handful of pithy remarks and caveats hinging on the de- 
mographer’s craft, but worth pondering by all those who study society: 
e.g., “there is sometimes more stability behind changing figures and greater 
mobility behind stable figures than one would imagine” (p. 19). 

Chapter II provides a lucid and interesting discussion of the balance sheet 
approach to computing population changes. This approach, “somewhat 
arduous, strictly inductive,” the reviewer found easy to follow, clear, and 
highly informative. The method may be illustrated by listing the items in 
the French balance sheet, 1939-1945: Population, end of 1938; Births; 
“Normal” deaths; War losses (military; civilians, non-Jewish; Jews killed); 
Population shifts, net balance; Population end of 1945. Similarly, with some 
different inclusions, for 1946-1947. Items are tallied plus or minus. 
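
The arithmetic of such a balance sheet can be sketched as follows. This is a 
minimal illustration only: the class name, the field names, and every figure 
are hypothetical placeholders and are not Frumkin's data. 

```python
from dataclasses import dataclass


@dataclass
class BalanceSheet:
    """Population balance sheet for one country and period (in thousands)."""
    start_population: float
    births: float
    normal_deaths: float
    war_losses: float
    net_shifts: float  # net balance of population shifts; negative = net outflow

    def end_population(self) -> float:
        # Items are tallied plus or minus, as in the review's description.
        return (self.start_population
                + self.births
                - self.normal_deaths
                - self.war_losses
                + self.net_shifts)


# Hypothetical figures, chosen only to show the tally.
sheet = BalanceSheet(start_population=1000.0, births=120.0,
                     normal_deaths=100.0, war_losses=30.0, net_shifts=-15.0)
print(sheet.end_population())  # 975.0
```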

Chapter III is the heart of the book. Here we find detailed balance sheets 
for twenty-four European countries, and a careful discussion of relevant data 
for the U.S.S.R. The distinction between the latter and the former rests on 
the non-availability of statistics from the U.S.S.R. Each item on the balance 
sheet is systematically explained, the reliability of the figures discussed, and 
statistical conclusions are drawn. One cannot fail to be impressed by the 
meticulousness of Mr. Frumkin’s research, and his success in achieving an 
“unbiased enquiry”. On a different note, one cannot read these pages without 
becoming sickened by the violence and destruction and misery underlying 
the figures. 

Chapter IV summarizes the results of the study, in a valuable combined 
balance sheet for all countries, and goes on to say something of the meaning 
of the figures and the prospects for the future. Some of these remarks are 
worth quoting here: “In spite of heavy losses, Europe . . . emerged from the 
turmoil of war with a larger population in 1947 than on the eve of the war.” 
(p. 175). “A characteristic feature of the last war was . . . that the main loss 
of population was due not to fighting, but to mass-murder by the German 
invader ... Civilian losses were overwhelmingly concentrated in areas oc- 
cupied by the Germans, and the number of Jews murdered made up almost 
one-half of the total civilian losses. In Poland alone the number of Jewish 
victims exceeded 3 millions” (p. 182). “War and genocide . . . accounted... 
for the death of 15 million persons, almost 6 millions being military deaths, 
and over 9 million deaths among civilians” (p. 181). Add to this the 17 
millions of war losses estimated for the U.S.S.R., which Frumkin regards 
as being “on the low side” (p. 164). Add the millions in the “exceptionally 
large shifts of the population chased across national boundaries” (p. 188). 
The book concludes with a grim warning: “As the result of World War II, 
national minorities in Europe have mostly disappeared. Mass-murders, 
shifts of frontiers and ruthless mass-transfers were the instruments by means 
of which the former European mosaic has been converted into an array of 
ethnically homogeneous units each with the sign: ‘Trespassers will be prose- 
cuted.’ These units, born under coercion, cannot be maintained ethnically 
pure except through coercion” (p. 190). 

Mr. Frumkin throughout emphasizes the tentative nature of his con- 
clusions. Tentative though they may be, the diligence, the years of experi- 
ence, and the considerable talents of Mr. Frumkin which come through on 
every page would seem to indicate that it will be Frumkin or those following 
his methods who will improve upon what looks to be a definitive work. 


Bristol on the Move—A Travel Survey. British Transport Commission. London: 
1953. Pp. 46. 10s.6d. 


LEONARD P. ADAMS, Cornell University 


THIS study was authorized by the British Transport Commission and carried 
out by Research Services Ltd. in the winter of 1950-1951. It was 
designed to provide information on the travel patterns of the people of 
Bristol (defined for present purposes to include those residing within a ten- 
mile radius [more or less] of the city) of interest to the sociologist, to the op- 
erator of public services, and to the advertiser. The principal findings with 
respect to the methods used in travel, costs to the traveler, types of travelers 
(age, sex, income class, etc.), purposes of travel, time required, and other 
items are presented in a series of tables supplemented by a narrative discus- 
sion and some excellent photographs of the city and its environs. 

No doubt the data obtained from this survey have been of service to trans- 
port operators and advertisers, and probably to some extent to sociologists. 
From the standpoint of those not particularly interested in the Bristol area 
per se but interested in local studies of travel patterns, the design of the 
Bristol survey and the methods used will be of interest. This reviewer has 
recently been studying commuting patterns of industrial workers, so is prob- 
ably inclined to be more critical of some of the methods used than those 
interested in administration or selling, although in some significant respects, 
as will be noted, the study has weaknesses from their point of view also. 

In designing the study Research Services Ltd. evidently assumed that 
there was little value in explaining the geographic limits of the Bristol area. 
This assumption may be well founded in fact but there is no explanation of 
why the ten mile limit was chosen or why the possibilities of longer distance 
travel to or from Bristol were ruled out. The authors admit that there is 
considerable traffic between Bristol and Bath which has been excluded from 
the study because the City of Bath “has transport services of its own which 
would have to form the subject of a separate investigation.” But there 
is no way of telling from the report whether or not some of the households 
surveyed have closer economic ties to Bath than to Bristol even though 
they live within the ten mile radius from Bristol. Data on travel are not 
given in terms of places of origin and destination. Omission of such infor- 
mation, together with a predetermined definition of the Bristol area, raises 
questions about the scope of the area and the relationship of its people to 
other population centers that the published summary does not answer. To 
what extent do the people of Bristol have economic and social ties with those 
in Bath? How much cross traffic is there? For purposes of advertising and 
labor supply marketing, should Bath and Bristol be combined in a single 
area? 

Presumably the investigators considered that for their purposes a more 
or less arbitrary definition of the Bristol area would be adequate. In any case, 
it facilitated the selection of the sample households to be visited. The primary 
sample was chosen by taking every 150th address from the Electoral Reg- 
isters covering the sample area. A secondary sample consisting of the fifth 
address following each primary address was also selected at the time the 
primary was drawn. When the householder at the primary address could not 
be reached or would not cooperate the nearby secondary sample was sub- 
stituted. In case there were two or more households at the same address, 
additional interviews up to three were taken in order to avoid weakening 
the geographic distribution of the sample. Such additional interviews were 
called “subsidiary” interviews. Each member of the household was ques- 
tioned concerning his or her use of public transport in the seven days pre- 
ceding the interview. The methodology used was, in general, the same as that 
followed in the London Survey, except that all use of public transportation 
in the preceding seven days was recorded instead of just the “regular” jour- 
neys. Although no measures of the adequacy and reliability of the sample 
are given, it seems probable that the results give a reasonably good picture 
of the characteristics of travel within the area selected. 
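
The selection rule described above (every 150th address as the primary 
sample, with the fifth address following each primary held as a substitute) 
can be sketched as follows. This is an illustration only: the function names 
are hypothetical, and the starting point in the register is an assumption, 
since the report as summarized here does not state it. 

```python
def draw_samples(addresses, interval=150, offset=5, start=0):
    """Return a list of (primary, secondary) address pairs.

    primary   : every `interval`-th address, beginning at index `start`
                (the starting index is an assumption of this sketch)
    secondary : the address `offset` places after the primary, or None
                if that position falls beyond the end of the register
    """
    pairs = []
    for i in range(start, len(addresses), interval):
        j = i + offset
        secondary = addresses[j] if j < len(addresses) else None
        pairs.append((addresses[i], secondary))
    return pairs


def address_to_visit(primary, secondary, primary_available):
    """Use the secondary address when the primary household cannot be
    reached or will not cooperate."""
    return primary if primary_available else secondary


# Hypothetical register of 600 addresses numbered 1 to 600.
register = list(range(1, 601))
pairs = draw_samples(register)
print(pairs)  # [(1, 6), (151, 156), (301, 306), (451, 456)]
```

The sketch omits the "subsidiary" interviews taken where several households 
share one address. 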

While a single cross-section view of travel patterns has value from the 
standpoint of operators and advertisers, it has severe limitations if one is 
trying to understand trends in housing location, the journey-to-work, mode 
of travel and other matters. Travel patterns in the Bristol area, if they are 
at all similar to those in industrial centers in this country, are probably 
dynamic, changing with such factors as new housing developments and the 
location of new industrial plants—and, judging by some of the photographs 
in the report, the Bristol area has had some interesting post-war housing de- 
velopments. Changes in travel patterns probably could be measured with a 
fair degree of accuracy by the interview method. Experience with travel 
studies in this country suggests, however, that employer personnel records 
provide information more readily with respect to journey-to-work patterns 
and they also show changes over time. If similar records are available in 
English plants, Research Services Ltd. in making future studies may wish 
to use this source to provide a basis for defining the geographic limits of the 
area to be surveyed and also to show changes in the distances traveled and 
characteristics of the work force employed. 


International Shipping Cartels. Daniel Marx, Jr. Princeton, N. J.: Princeton 
University Press, 1953. Pp. xiii, 323. $6.00. 


ROCKWOOD CHIN, Berea College 


THIS is a scholarly book about shipping conference agreements as a form 
of international cartel. This topic is usually neglected by books on inter- 
national cartels and inadequately treated in transportation texts. The author 
is to be commended for his comprehensive description of the nature, history, 
purposes, and methods of shipping cartels. Competition from non-conference 
liners and tramp shipping is also discussed. 

Particularly detailed is his analysis (in Chapters IV, V, and VII) of Ameri- 
can and British experience in the investigation and regulation of shipping 
practices dating from about the beginning of the twentieth century. All the 
main modern arguments for and against shipping cartels are virtually cov- 
ered in the early reports of the Royal Commission on Shipping Rings (1909) 
and the Alexander Committee (1914). Out of the recommendations of the 
latter there evolved in the United States the Shipping Act of 1916 which, 
among other things, prohibited deferred rebates, “fighting ships,” and unjust 
discrimination. Curiously enough, there is no specific reference to the Bland- 
Copeland Bill of 1935 which unsuccessfully attempted to reverse the 1916 
legislation. 

Chapter VI includes brief, uneven sketches of restraints imposed upon 
national participation in shipping conferences by other selected countries: 
British Dominions and some European and Far Eastern countries. The 
economics of shipping conferences, discussed in Chapters II, III, and XII, 
supplemented by a note in the Appendix on discriminatory pricing, provide 
a theoretical background for the factual presentation. 

The reader may jump to Chapter XIV for the conclusions of the book 
without loss of the main argument. In trying to be objective, the author 
sometimes appears to vacillate in his attitude toward international shipping 
cartels, and there is occasional ambiguity as to what he is definitely recom- 
mending. Generally speaking, he regards international shipping cartels as 
inevitable, capable of abusing their privileges, and also of suffering from 
wastes of competition. Monopoly and cutthroat competition are both de- 
cried. “The record seems to indicate that relatively few conferences have 
been guilty of very serious abuses, unless exclusive patronage contracts and 
pooling arrangements approved by the Commission are so condemned” (p. 
136). Regulation by national governments and an international agency (at 
the time of writing, the Intergovernmental Maritime Consultative Organiza- 
tion had not yet come into existence) would both have their limitations, 
though desirable for supervising tying arrangements and for preventing 
monopoly abuses and undue discrimination. Some sort of improvement in 
self-regulation through the existing conference system is favored, but the 
author does not make clear how this can actually be attained. 

Chapter III, on the general economic and political environment, deals with 
the relation of shipping to international trade, location of economic activity, 
and balance of payments. It contains the bulk of the book’s trade and ship- 
ping statistics. Elsewhere statistics are few. Though there is a table in Chap- 
ter XII comparing fluctuations in indices of tramp freight and conference 
general cargo rates, actually freight rates are cited rather infrequently. In 
Chapter IX there are also charts showing the names and membership of 
various freight and passenger conferences participating in United States 
foreign trade and travel. One misses, however, the complete text of a confer- 
ence agreement of the type described in the book, of which we are told 
there are over one hundred on file at the Maritime Commission (now the 
Federal Maritime Board). 

A word of caution on figures given as shipping company earnings of 
foreign exchange: The early British figures (p. 43) really refer to total earn- 
ings by British shipping companies in carrying exports and imports, as well 
as inter-third country trade, and are therefore not all additions to foreign 
exchange. Of course, the British method of c.i.f. valuation of imports com- 
pensated for the overstatement of transportation credits in the balance of 
payments. The 1949 shipping earnings figure, calculated on an f.o.b. basis, 
is not comparable to the earlier figures. Similar remarks apply to the Nor- 
wegian data cited on p. 42. Furthermore, in singling Norway out as the 
country for whose economy shipping is of the greatest importance, the ratio 
of shipping foreign exchange earnings to national income would have been 
a more significant criterion than net earnings figures alone or their ratio to 
the trade balance. 

On the whole this is a very fine book on a much-neglected subject. A 
wealth of information has been pieced together from books, documentary 
reports, government agencies, and case studies. It fills a need in the literature 
of international cartels and transportation. Students of economics, govern- 
ment, and transportation will find it very useful. Since the author can already 
point out some comparisons with the international aviation field, I hope that 
he may someday also publish a treatise on international air transport cartels 
on the same high level. 


RANDOM DIGITS (17,376-20,875) 
From A Million Random Digits to be Published for The Rand Corporation toward the end of 1954 by 





Glencoe, 
04808 99531 47991 46064 80467 
71924 64882 94893 82935 99076 
56410 89552 28404 74525 74212 
38851 16144 99542 27481 21992 
91428 10589 09454 43308 66753 
40083 17141 30702 31997 69856 
93419 10474 41796 88285 02448 
03704 65516 65448 20203 21189 
78181 90060 74904 42627 16638 
45972 93572 76011 03426 50226 
60898 63968 62264 64603 51866 
40398 54180 65869 87977 02799 
68245 76912 01222 59516 36438 
27019 15248 66444 25267 05171 
99868 88894 43769 52239 05919 
87904 74135 53842 59520 23979 
68851 41049 97190 53984 04773 
71742 57223 66599 86071 01901 
02742 48803 17823 22093 43907 
56181 96052 67211 61712 54590 
55355 61548 55988 47309 23749 
78961 41072 09876 18903 30292 
92654 97226 53434 77025 63892 
13757 37719 84450 02697 60309 
05776 85945 74651 00216 50842 
71039 83083 60427 78495 99809 
61672 01184 46438 27698 40652 
42988 77983 58708 42176 67356 
13652 16640 27896 26907 86760 
53186 97859 97213 19859 41037 5 
47890 10690 26486 38744 25943 
65654 34629 88831 97253 67282 
00324 17120 39900 67135 42772 
48244 26191 88421 90491 83290 
64081 47704 15018 45600 17241 
60617 06414 56596 63011 24193 
72860 18452 42983 23931 11789 
04631 §5283 19605 34163 86540 
06884 15444 09310 17048 24243 
26611 09551 82626 38194 58432 

58932 
73073 
42665 
59985 
50943 


22224 
24473 
38582 
46094 
91061 


00397 
14328 
88534 
97347 
01366 


37106 
06476 
81717 
51583 
50120 


89761 
08943 
71685 
17402 
52606 


66035 
21565 
88735 
50404 
80834 


26872 
16530 
84644 
88620 
22209 


04795 
54291 
30654 
11123 
56577 


58987 
16851 
02104 
54440 
87681 


24337 
62557 
02913 
68706 
05930 


89326 
52171 
80748 
59807 
40422 


02627 
42096 
21871 
43845 
31674 


56753 
44708 
87112 
87316 
72976 


20523 
70603 
48410 
69788 
33884 


23053 
66660 
97247 
66300 
39860 


07223 
30786 
75275 
80166 
11317 


72927 
96086 
00448 
72894 
78590 


53971 
56045 
48543 
08732 
51257 


02026 
99197 
49435 
07893 
42543 


61634 
25292 
03885 
87619 
33213 


33491 
89301 
77622 
60562 
63035 


91576 
76920 
14672 
91838 
73729 


53158 
72952 
68614 
73087 
01868 


21584 
97122 
94516 
41758 
83655 


77480 
11057 
79368 
94385 
92127 


76264 
45403 
03080 
28017 
93109 


79021 
17329 
86828 
94716 
68615 


14592 
61635 
18339 
49393 
83291 


42969 
70476 
77706 
31618 
69847 


52574 
72781 
58822 
13846 
78416 


04617 
74066 
15779 
85747 
60344 


16781 
88864 
93362 
79574 
99315 


71872 
27048 
83073 
77135 
51667 


93712 
44978 
15427 
55004 
88345 


28683 
98849 
43710 
01717 
42588 


29148 
33782 
77653 
52611 
91857 


51571 
87959 
50552 
84622 
58113 


39634 
32186 
65024 
12911 
12329 


59144 
77113 
18924 
35707 
81848 


83649 
17186 
82941 
56197 
00194 


82717 
37361 
94028 


89184 
54164 
67981 
08003 
16699 


68153 
67887 
88794 
71883 
63279 


83654 
78028 
85323 
30992 
69602 


68324 
29499 
80365 
96191 
93307 


68652 
93424 
55430 
60012 
47904 


68825 
23727 
84832 
49771 
23727 


03855 
86651 
33386 
75803 
17827 


84349 
46320 
24957 
65130 
32034 


28725 
10393 
43806 
27151 
91369 

70119 31347 12659 11574 70052 
98390 30240 28330 41145 16918 
08172 23823 48433 57222 34435 
21238 19051 50768 40807 88681 
79342 44640 93942 97371 16842 


93039 79367 00812 41365 04515 
62865 09576 97207 33739 78345 
00800 72496 24767 61768 07228 
64340 02224 48336 14891 72188 
92168 52692 31224 12185 43065 


20494 18813 16242 40257 66402 
87693 30242 70545 69128 51528 
05567 05561 82071 07234 67690 
85166 37189 75671 33879 27411 
26704 47922 56650 40236 66207 


01047 81624 77395 62310 41501 
58183 21952 84098 28913 55736 
64667 57092 21315 04731 71877 
27149 13843 09817 09407 88276 
66232 80293 74502 36925 60184 


40500 21406 00571 87320 81683 
35892 49668 83991 72088 30210 
54819 26094 51409 21485 94764 
64224 47909 09994 23750 17351 
36913 58173 45709 83679 82617 


64254 64745 10614 86371 43244 
82018 25536 74031 31807 70133 


28833 44043 96215 21270 59427 
96879 27659 95463 53847 40921 
95938 76014 99818 16606 19713 


97154 71237 06073 57343 51428 
78790 17026 59008 28543 11576 
25034 59325 08844 95774 49323 
70116 44091 88505 15575 44927 
66904 23000 73259 68626 98962 


91171 28299 62619 81550 46798 
74547 13260 79262 55831 83784 
30448 14154 75795 39465 82353 
06584 29867 45898 66415 89349 
68548 86576 14344 75889 04514 


49319 50206 22024 56124 50749 
81034 86779 34622 70859 33045 
68905 44234 18244 31602 38388 
88530 72096 44459 31449 93182 
37227 11302 04667 32526 64713 


83220 50529 20619 11606 10297 
66703 30017 35347 35038 16648 
69556 76728 60535 59961 76979 
99040 96390 65989 38375 30332 
85185 72849 58611 31220 66108 





